Preface to the Second Edition


The purpose of the second enlarged edition is to complement the previous overview 
of the relations between entropy, information and dynamics in classical and quan- 
tum systems with some issues that have emerged since the previous edition and 
that I find relevant to the main theme of the book. The corresponding new addi- 
tions concern the dissipative dynamics of open quantum systems and the flow of 
information between them and their environment with and without memory effects; 
the use of entanglement in quantum metrology and its persistence at a mesoscopic 
level of collective quantum fluctuations and, finally, the recent coming together of 
Machine Learning and Quantum Mechanics with the comparison of the storage 
capacity of simple classical and quantum perceptrons. 

The perspectives provided by this new edition owe much to my scientific
collaborations through the years since the previous one: first of all, with my
colleague and friend Roberto Floreanini, and then with Federico Carollo, Dariusz
Chruściński, Ugo Marzolino and Stefano Mancini. Working with all of them has
constantly provided me with neat examples of how beautifully and fruitfully rig- 
orous mathematics and physical intuition can work together and how necessary it 
is that they do. 

My particular thanks also go to Samad Khabbazi Oskuei and Ahmad Shafiei 
Deh Abad for their interest in the first edition of this book. 

Last but not least, my deep love goes to my wife Paola who has always
been on my side during the entire enterprise. 


Trieste, Italy Fabio Benatti 
August 2023 
Preface to the First Edition


The aim of this book is to offer a self-consistent overview of a series of issues relating
entropy, information and dynamics in classical and quantum physics. My personal 
point of view regarding these matters is the result of what I had the good fortune 
to learn in the course of the years from various scientists: Heide Narnhofer in 
the first place, who introduced me to quantum dynamical entropies and was a 
precious guide ever since, then Robert Alicki, Mark Fannes, Giancarlo Ghirardi, 
Andreas Knauf, John Lewis, Geoffrey Sewell, Franco Strocchi, Walter Thirring, 
Armin Uhlmann. All of them have been to me a constant example of rigorous 
mathematics and physical intuition jointly at work. 

Last but not least, my deep gratitude goes to my family and to the many friends 
on whom I could always count for support and encouragement with a special 
thought for Traude and Wolfgang Georgiades. 


Trieste, Italy Fabio Benatti 
August 2008 
Introduction


The focus of the first edition of this book was quantum dynamics in connection with
correlations as carriers of information and entropy as a measure of information. In
the same spirit, this new version takes into account some of the many developments
that have greatly enriched quantum information theory since the first edition and that I
consider relevant for the theme of dynamics and information in the quantum realm.

One recent development concerns the interrelations between Machine Learning 
and Quantum Information. Machine Learning aims at building methods able to make 
predictions and take decisions, based on sample data, without being explicitly pro- 
grammed to do so. Quantum Information studies the storage and transmission of 
information encoded in quantum states. Nowadays these two disciplines are becom- 
ing intertwined giving rise to the field of Quantum Machine Learning. The relations 
between them are of two different kinds: on one hand, Machine Learning can be 
used to efficiently study quantum systems and quantum dynamics. On the other 
hand, quantum devices can be used to bring quantum advantages to Machine Learn- 
ing in terms of higher storage capabilities and increased information processing 
power. The research on possible quantum advantages in Machine Learning is still 
in its infancy. However, its potential is undeniable, and the major additions to
this new edition are Sect. 3.3, which contains a brief introduction to the notion of 
Perceptron and to the statistical approach to its storage capacity, and Sect. 7.8 which 
illustrates a recent model of quantum Perceptron, computes its storage capacity and 
shows that no advantages are to be expected at least in that particular case. 

Other important advances concerning dynamics and information are those rela- 
tive to quantum correlations, that is entanglement, and the time-evolution of open 
quantum systems that experience dissipation and noise due to the presence of an envi- 
ronment. There, interesting phenomena appear when one goes beyond the standard 
memoryless paradigm and considers memory effects. These are indeed accompa- 
nied by information coming into the open system from the environment and not only 
going out from the open system into the environment. Some puzzling aspects of this
flow of information in the so-called non-Markovian scenario have been discussed in 
Sect. 5.6.6. 

The role of entanglement and its underlying non-classical correlations is of fun- 
damental importance in quantum information theory; the space dedicated to it has 
been enlarged in three respects. In Sect. 6.4 we have considered its use in quantum 
metrology as a means to increase the estimation accuracy of parameters encoded in 
quantum states, emphasizing the relation of entanglement to non-local correlations 
and the change of attitude required when one deals with identical particles. The new 
section added at the end of Sect. 6.2 addresses instead the possibility that a dissipa- 
tive open dynamics might be employed to generate entanglement within a bipartite 
two-qubit system via a suitable engineering of the coupling to the environment in 
which they are immersed, thus going against the expectation that openness and deco- 
herence can only be detrimental to entanglement. Finally, in Sect. 7.7 an overview 
of the theory of quantum fluctuations is provided as a means of keeping quantumness
at a mesoscopic level in between micro- and macro-physics, namely at the
scale of many-body quantum circuits that are of great interest for the concrete
implementation of quantum computation. Quantum fluctuations are collective observables
scaling with the inverse square-root of the number of particles, for which one can
find the persistence at the mesoscopic level of the entanglement generated by suitable
dissipative microscopic dynamics. Apart from the additions just mentioned, the
structure of the book has remained the same.

For classical dynamical systems, the notion of dynamical entropy was introduced 
by Kolmogorov and developed by Sinai (KS entropy) and provided a link among 
different fields of mathematics and physics. In fact, in the light of the first theorem 
of Shannon, the KS entropy gives the maximal compression rate of the informa- 
tion emitted by ergodic information sources. A theorem of Pesin relates it to the 
positive Lyapounov exponents and thus to the exponential amplification of initial 
small errors, in a word to classical chaos. Finally, a theorem of Brudno links the KS 
entropy to the compressibility of classical trajectories by means of computer programs,
namely to their algorithmic complexity, a notion introduced independently
and almost simultaneously by Kolmogorov, Solomonoff and Chaitin.

In a previous book by the author, the notion of quantum dynamical entropy elab- 
orated by A. Connes, H. Narnhofer and W. Thirring (CNT entropy) was presented 
within the context of quantum ergodicity and chaos. The CNT entropy is a partic- 
ular proposal of how the KS entropy might be extended from classical to quantum 
dynamical systems. 

After the appearance of the CNT entropy, other proposals of quantum dynamical 
entropies appeared which in general assign different entropy productions to the same 
quantum dynamics. The basic reason is that each proposal is built according to a 
different view about what information in quantum systems should mean. Concretely, 
it is a general fact that, in order to gain information about a system and its time- 
evolution, one has to observe it and a quantum fact that observations may be invasive 
and perturbing. Should this fact be considered inescapable and thus incorporated in 
any good quantum dynamical entropy or, rather, should it be avoided as a source of
spurious effects that have nothing to do with the actual quantum dynamics? 

This is an unavoidable question and, based on the possible answers, one is led to 
different notions of quantum dynamical entropies. These will be sensitive to different 
aspects of the quantum dynamics and thus, not unexpectedly, not equivalent: the real
issue is which these aspects are and what kind of informational meaning they
possess.

In view of the role of the KS entropy in classical chaos, one of the principal 
applications of the quantum dynamical entropies has been to the phenomenology of 
quantum chaos. The scope has now become wider: quantum compression theorems 
and recent attempts at formulating a non-commutative algorithmic complexity theory 
motivate the study of whether and how the different quantum dynamical entropies 
are related to these new concepts. In particular, a better understanding of the many 
facets of information in quantum systems may come from clarifying the relations 
of the various quantum dynamical entropies among themselves and their bearing 
on quantum compression schemes and the algorithmic reproducibility of quantum 
dynamics. 

The issue at stake can be conveniently conveyed by an example: the simplest 
classical ergodic information source emits bits independently of each other with 
probabilities 1/2 for both 0 and 1. The KS entropy is log 2 and represents 


1. the information rate of a classical source emitting independent bits; 

2. the Lyapounov exponent of the classical dynamical system consisting in throwing 
a fair coin; 

3. the algorithmic complexity of almost every resulting sequence of tails and heads. 
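The first item of the list can be checked numerically. The following is a minimal sketch, not taken from the book: it computes the entropy per symbol of blocks of independent bits by brute-force enumeration and compares it with log 2. The function names `shannon_entropy` and `block_entropy_rate` are hypothetical, and the sketch does not address the Lyapounov exponent or the algorithmic complexity of items 2 and 3.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (natural logarithm) of a finite probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def block_entropy_rate(p_one, n):
    """Entropy per symbol of n independent bits, each equal to 1 with probability p_one."""
    probs = []
    # Enumerate all 2**n binary words; independence gives product probabilities.
    for word in range(2 ** n):
        ones = bin(word).count("1")
        probs.append(p_one ** ones * (1 - p_one) ** (n - ones))
    return shannon_entropy(probs) / n

fair_rate = block_entropy_rate(0.5, 8)     # fair coin: equals log 2
biased_rate = block_entropy_rate(0.9, 8)   # biased coin: strictly smaller
```

For independent bits the rate does not depend on the block length `n`; for sources with memory one would have to take the limit of large `n`.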


The quantum counterpart of such an information source is a so-called quantum spin 
chain, that is a one-dimensional lattice carrying a 2 x 2 matrix algebra at each of 
its infinitely many sites: each site carries a so-called qubit. The dynamics of such a
system is just the shift from one site to the other and the infinite dimensional algebra
of operators is equipped with a translation-invariant state. These non-commutative
structures have recently become of primary importance in the booming field of quantum
information. What is relevant is that one can construct subalgebras of quantum
spin chains characterized by varying degrees of non-commutativity between their
operators. Depending on that degree, the CNT entropy varies between zero and
log 2, while another quantum dynamical entropy, the AFL entropy of Alicki, Fannes
and Lindblad, is always log 2. The CNT entropy thus appears to be sensitive to
the amount of non-commutativity between operators, whereas the AFL entropy is
apparently independent of that structural algebraic property.

Because of its unifying properties, the KS entropy can be taken as a good indicator 
of classical randomness and complexity; one would then like to assign a similar role 
to the quantum dynamical entropies. Does this mean that, in accordance with the CNT 
entropy behavior, quantum dynamical systems have varying degrees of complexity 
or randomness depending on the degree of non-commutativity? Or, according to 
the AFL entropy, the algebraic structural properties have no bearing on dynamical 
randomness or complexity, which are rather related to the statistics of such systems,
namely to their shift-invariant state? 

More concretely, one may ask which one of the two quantum dynamical entropies 
is closer to the actual quantum informational structure of these quantum sources. 
Regarding this issue, of particular interest are the yet unexplored relations of the 
quantum dynamical entropies to the quantum algorithmic complexities. 

Indeed, as there are inequivalent generalizations of the KS entropy, so there are different
extensions of the classical algorithmic complexity. These extensions have
been motivated by the possibility of a model of computation based on the laws of 
quantum mechanics and on the theoretical formulation of the notion of Quantum 
Turing Machines (QTMs). Like Classical Turing Machines (TMs), QTMs consist of 
a read/write head moving on tapes with, say, binary programs written on them. The
difference is that the tapes of QTMs can occur in linear superpositions of the classical
configurations of 0’s and 1’s. In a word, inputs and outputs of QTMs are qubits.

Since the various quantum dynamical entropies were proposed, independently of 
quantum information, as tools to better study the long-time dynamical features of 
infinite quantum systems, one may doubt that relations should exist between them 
and quantum information. One notices, however, that the CNT entropy was devel- 
oped using the notion of entropy of a subalgebra which, years later, independently 
appeared in quantum information theory as a measure of entanglement known as 
entanglement of formation. Also, the AFL entropy is based on techniques that in 
quantum information theory are fundamental tools to describe quantum channels 
and, more generally, all quantum operations that may affect quantum systems.


The book is organized in three parts. 

In the first part, the first chapter presents basic notions of ergodic theory, the second 
gives an overview of entropy in information theory, the third addresses the notion 
of KS entropy, the classical compression theorems and some aspects of Machine 
Learning, in particular the classification capacity of a simple perceptron. Algorithmic 
complexity is instead the subject of the fourth chapter. 

The second part consists of three chapters; the first offers an overview of algebraic 
quantum mechanics with particular emphasis on the notions of positivity and com- 
plete positivity of quantum maps and quantum time-evolutions, both reversible and 
irreversible without and with memory effects. The second chapter introduces the fun- 
damentals of entanglement and quantum correlations and of their use in metrological 
contexts; the basics of quantum information with particular emphasis on the rela- 
tions between positive and completely positive maps and the dynamics of quantum 
entanglement; the entropy of a subalgebra and its connections with the entanglement 
of formation and the accessible information of a quantum channel. The third con- 
cerns infinite quantum dynamical systems and quantum ergodicity, quantum chains 
as quantum sources with the quantum counterparts to Shannon’s theorems, quan- 
tum fluctuations and the emergent non-commutative collective behaviour at the level 
of mesoscopic physics, continuous variable quantum perceptrons and their storage 
capacity. 

In the first chapter of the third part, a detailed introduction is given to the CNT and 
AFL entropies and to their use in the study of dynamical information production in 
quantum systems. Finally, the second and last chapter of the book focusses on some 
recent extensions of algorithmic complexity to quantum systems, starting with a
discussion of quantum Turing machines and quantum computers and concluding with 
an exploration of the possible role played in this context by the quantum dynamical 
entropies. 

The topics addressed come from rather different fields that only recently, because
of the birth and rapid development of quantum information, quantum communication
and computation, have started to overlap. This book has been written not as an
introduction to any of these topics (of which exhaustive presentations exist in
plenty), but rather as an attempt to provide readers who have expertise in some, but
not all, of the topics with a self-consistent overview of these many subjects. Therefore, care
has been taken to give proofs of almost all of the results that have been used, apart 
from basic and standard facts, and to illustrate them by means of selected examples. 


Part I


Classical Dynamical Systems 


In the first part of the book, classical dynamical systems are presented from various 
perspectives, among them not only those of ergodic, information and algorithmic 
complexity theory but also that of Machine Learning. 

Ergodic theory studies the clustering properties of equilibrium states; in infor- 
mation theory, the central notion of entropy is used to quantify the degree 
of predictability of phase-space trajectories, while algorithmic complexity the- 
ory quantifies their randomness in terms of how easily they can be described 
by algorithms. Simple perceptrons as fundamental operative units for Machine 
Learning techniques are finally considered in relation to their capacity of storing 
information. 

The purpose of the presentation is to set up a suitable algebraic framework 
that can be used to extend these various points of view to quantum dynamical 
systems, thus facilitating a comparison and the possibility of checking advantages 
and disadvantages inherent in the quantum setting. 


Classical Dynamics and Ergodic Theory


In this chapter the term classical dynamical system will broadly refer to one-parameter
families of transformations, or dynamical maps, $T_t$, acting on a phase space
$\mathcal{X}$ whose points $x$ describe the system degrees of freedom. In physical
applications, $x$ identifies an initial state, or configuration, and $T_t x$ the resulting
state or configuration after a span of time of length $t$. If $t$ is discrete,
$t \in \mathbb{Z}$, one speaks of a reversible time-evolution through discrete time steps
with trajectories $\{T_t x\}_{t \in \mathbb{Z}}$ consisting of countably many configurations
at negative and positive integer times. If $t \in \mathbb{N}$, the dynamics can only
develop forward in time and is thus irreversible. In the case of a continuous-time
dynamics, trajectories through $x \in \mathcal{X}$ at $t = 0$ are continuous sets
$\{T_t x\}_{t \in \mathbb{R}}$ of configurations if the dynamics is reversible; otherwise
trajectories are only forward in time, $\{T_t x\}_{t \in \mathbb{R}^+}$.

Once the description of a system by means of a phase-space $\mathcal{X}$ has been chosen,
any phase-point $x \in \mathcal{X}$ contains all possible information about the system state.
When all this information is not available, the state of a system amounts to a normalized
positive measure on $\mathcal{X}$, a probability distribution, such that the volume of
a measurable subset gives the probability that $x$ belongs to it. Entropy quantifies the
amount of information corresponding to such a probability distribution, that is how
informative the measure is about the actual state of the system.

Besides the knowledge of the state of classical systems, information can also
concern how states change in time, in particular, as regards foreseeing their behav- 
ior; the degree of predictability of dynamical systems is measured by dynamical 
entropies. Intuitively, regular time-evolutions should allow for reliable predictions, 
which are instead hardly possible for irregular dynamics; roughly speaking, irregu- 
larity is expected to correspond to the fact that the past does not completely contain 
the future. 

Information about the state or the time-evolution of physical systems can be 
obtained by measuring suitable quantities accessible to experiments. These quantities,
called observables for short, correspond to functions on $\mathcal{X}$. Unlike for quantum
dynamical systems, for classical ones any measuring protocol can in principle be
assumed not to interfere with the system observed, the basic reason being that classical
descriptions involve commuting objects, as functions on the phase-space $\mathcal{X}$
indeed are.

Which observables are appropriate to describe a dynamical system depends on the
structure of the chosen phase-space $\mathcal{X}$; for instance, statistical descriptions
require that $\mathcal{X}$ be endowed with a measure-structure, whereby measurable functions
constitute appropriate observables. On the other hand, $\mathcal{X}$ might be provided with
a topology and typical observables would then correspond to continuous functions.


2.1 Classical Dynamical Systems 


In this section we review some basic facts relative to classical dynamical systems 
mainly adopting a measure-theoretic point of view; in this way a minimum of con- 
straints is put on the mathematical properties of states, observables and dynamical 
maps and the emerging technical context is broad enough to describe a large variety 
of physical phenomena, from those typical of Hamiltonian mechanics to those better 
understood in terms of discrete dynamical systems. 


Definition 2.1.1 Classical dynamical systems are triplets $(\mathcal{X}, T, \mu)$, where


1. $\mathcal{X}$ is a measure space with an assigned $\sigma$-algebra $\Sigma$ of measurable sets;

2. $T$ is measurable, that is $A \in \Sigma \Rightarrow T^{-1}(A) \in \Sigma$;

3. $\mathcal{X}$ is endowed with a $T$-invariant, positive normalized measure $\mu$, such that
$\mu(\mathcal{X}) = 1$ and $\mu \circ T^{-1} = \mu$.
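On a finite phase space the three requirements can be verified by direct enumeration. The sketch below is an illustration under simplifying assumptions, not the author's construction: the phase space is the six-point set {0, ..., 5}, the sigma-algebra is the full power set, and the maps `shift` and `doubling` as well as the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def preimage(T, A, points):
    """T^{-1}(A): the set of points x whose image T(x) lies in A."""
    return frozenset(x for x in points if T(x) in A)

def is_invariant(mu, T, points):
    """Check mu(T^{-1}(A)) == mu(A) for every subset A of a finite space."""
    measure = lambda A: sum((mu[x] for x in A), Fraction(0))
    for k in range(len(points) + 1):
        for A in combinations(points, k):
            A = frozenset(A)
            if measure(preimage(T, A, points)) != measure(A):
                return False
    return True

points = list(range(6))
shift = lambda x: (x + 1) % 6       # invertible map: rotation of Z_6
doubling = lambda x: (2 * x) % 6    # non-invertible map
uniform = {x: Fraction(1, 6) for x in points}
# Probability measure supported on the even points only.
even = {x: Fraction(1, 3) if x % 2 == 0 else Fraction(0) for x in points}
```

With these choices the uniform measure is invariant under the rotation but not under the doubling map, whereas the measure concentrated on the even points is invariant under the doubling map. This shows that invariance is a joint property of the pair (T, mu), and that it is formulated through preimages so that it also makes sense for non-invertible maps.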


Remarks 2.1.1 1. A collection $\Sigma$ of subsets $S \subseteq \mathcal{X}$ is called a
measure-algebra if (1) $\mathcal{X} \in \Sigma$, (2) $S \in \Sigma$ implies
$\mathcal{X} \setminus S \in \Sigma$, where $S_1 \setminus S_2$ denotes the complement of
the subset $S_2$ relative to the subset $S_1$, and (3) $S_i \in \Sigma$ for $i = 1, 2, \ldots, n$
implies $\bigcup_{i=1}^{n} S_i \in \Sigma$. A measure-algebra $\Sigma$ is a measure
$\sigma$-algebra if it is closed not only with respect to finite unions of its elements, but
also with respect to countable unions, that is if $\bigcup_{n} S_n \in \Sigma$ for all
$\{S_n\}_{n \in \mathbb{N}}$, $S_n \in \Sigma$. Since the complements of unions of sets are the
intersections of the complements of the sets, namely
$\mathcal{X} \setminus (A \cup B) = (\mathcal{X} \setminus A) \cap (\mathcal{X} \setminus B)$,
$\sigma$-algebras contain countable intersections of their elements, too.

2. Let $\Sigma_0$ be a measure-algebra; by adding to $\Sigma_0$ infinite unions and
intersections of elements of $\Sigma_0$ one obtains a $\sigma$-algebra $\Sigma$ which is the
smallest one containing $\Sigma_0$; such $\Sigma$ is called the $\sigma$-algebra generated by
$\Sigma_0$. If the measure space $\mathcal{X}$ is endowed with a topology, then the
$\sigma$-algebra generated by the open subsets is known as Borel $\sigma$-algebra and its
elements as Borel sets [305].
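3. A positive function $\mu : \Sigma \to \mathbb{R}^+$ such that $\mu(\mathcal{X}) = 1$ is a
probability measure on $\mathcal{X}$ relative to a $\sigma$-algebra $\Sigma$ if it is
$\sigma$-additive, namely if

\[
  S_i \cap S_j = \emptyset \quad \forall\, i \neq j
  \ \Longrightarrow\
  \mu\Big( \bigcup_{n=1}^{\infty} S_n \Big) = \sum_{n=1}^{\infty} \mu(S_n) .
\]

Notice that $\mu$ is automatically monotone under inclusion, namely
$A \supseteq B \Rightarrow A = (A \setminus B) \cup B \Rightarrow \mu(A) = \mu(A \setminus B) + \mu(B) \geq \mu(B)$.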


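4. The following criterion is rather useful: an additive positive finite map
$\mu : \Sigma \to \mathbb{R}^+$ is $\sigma$-additive if and only if
$\lim_{n} \mu(B_n) = 0$ for any collection $\{B_n\}_{n=1}^{\infty}$ of sets
$B_n \in \Sigma$ such that $B_{n+1} \subseteq B_n$ and $\bigcap_{n} B_n = \emptyset$. Indeed,
suppose $\mu$ is $\sigma$-additive and $\{B_n\}_{n=1}^{\infty}$ is decreasing with empty
intersection; then, the sets $C_n = B_n \setminus B_{n+1}$ are disjoint and
$B_n = \bigcup_{k \geq n} C_k$. It thus follows that
$\mu(B_1) = \sum_{k=1}^{\infty} \mu(C_k)$, whence

\[
  \lim_{n \to \infty} \mu(B_n) = \lim_{n \to \infty} \sum_{k=n}^{\infty} \mu(C_k) = 0 .
\]

Vice versa, let $\mu$ be positive, finite and additive on $\Sigma$ and take any collection
$\{C_n\}_{n=1}^{\infty}$ of disjoint elements of $\Sigma$; because of additivity

\[
  \mu\Big( \bigcup_{k=1}^{\infty} C_k \Big)
  = \sum_{k=1}^{n} \mu(C_k) + \mu\Big( \bigcup_{k=n+1}^{\infty} C_k \Big) .
\]

Since $B_n := \bigcup_{k=n+1}^{\infty} C_k \supseteq B_{n+1}$ and
$\bigcap_{n} B_n = \emptyset$, $\sigma$-additivity follows. If $\mu$ is
$\sigma$-additive over a measure algebra $\Sigma_0$ it can be extended in a unique way to the
$\sigma$-algebra $\Sigma$ generated by $\Sigma_0$. In other words, given $S \in \Sigma$, for
any $\varepsilon > 0$, there exists $S' \in \Sigma_0$ such that
$\mu(S \,\Delta\, S') < \varepsilon$, where

\[
  S \,\Delta\, S' = (S \setminus S') \cup (S' \setminus S) = (S \cup S') \setminus (S \cap S') . \tag{2.1}
\]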


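5. A regular Borel measure on $\mathcal{X}$ is a measure on the Borel $\sigma$-algebra such
that, for any measurable subset $B$ and $\varepsilon > 0$, there exist an open subset,
$U_\varepsilon$, and a closed subset, $C_\varepsilon$, with
$C_\varepsilon \subseteq B \subseteq U_\varepsilon$ such that
$\mu(U_\varepsilon \setminus C_\varepsilon) < \varepsilon$ [305,370].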
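The set-theoretic content of these remarks can be illustrated on a finite set, where every algebra is automatically a sigma-algebra. The following is a sketch under that assumption; `generated_algebra` and `symmetric_difference` are hypothetical helper names, and the two generating subsets are arbitrary.

```python
def generated_algebra(space, generators):
    """Smallest family containing the generators that is closed under
    complement and union; on a finite space this is a sigma-algebra."""
    space = frozenset(space)
    family = {space, frozenset()} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)  # snapshot, so the family can grow inside the loop
        for a in current:
            for candidate in [space - a] + [a | b for b in current]:
                if candidate not in family:
                    family.add(candidate)
                    changed = True
    return family

def symmetric_difference(s, t):
    """(S minus S') union (S' minus S), the left-hand side of (2.1)."""
    return (s - t) | (t - s)

space = range(6)
sigma = generated_algebra(space, [{0, 1}, {1, 2, 3}])
```

The two generators split the space into the four atoms {0}, {1}, {2, 3} and {4, 5}, so the generated family has 2^4 = 16 elements. Closure under intersections is not imposed in the code: it follows from closure under complements and unions, as stated in the first remark.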


Definition 2.1.1 provides an appropriate framework for irreversible dynamical
systems in discrete time whereby the time-evolution of phase-points $x \in \mathcal{X}$
consists in successively applying the dynamical map $T$ to $x$ so that trajectories are given
by countable sets $\{T^n x\}_{n \in \mathbb{N}}$. For reversible, discrete-time dynamics,
also $T^{-1}$ is assumed measurable, that is $T(A) \in \Sigma$ if $A \in \Sigma$, with
$\mu \circ T = \mu$; trajectories are then of the form $\{T^n x\}_{n \in \mathbb{Z}}$.
The measure $\mu$ defines a probability distribution over $\mathcal{X}$: if
$f : \mathcal{X} \to \mathbb{R}$ is a measurable function (an observable of the system),
its mean value is

\[
  \mu(f) := \int_{\mathcal{X}} d\mu(x)\, f(x) . \tag{2.2}
\]

In particular, if $A \subseteq \mathcal{X}$ is a measurable subset and $1_A(x)$ its
characteristic function,¹ the volume

\[
  \mu(A) := \int_{\mathcal{X}} d\mu(x)\, 1_A(x) , \tag{2.3}
\]

has a natural interpretation as the probability that $x \in \mathcal{X}$ belongs to $A$. We
shall also refer to these probability distributions as the states of a classical dynamical
system. In fact, in the case of a continuous phase-space, access to phase-points is
practically never achievable; thus, one has to content oneself with the knowledge of
how phase-points are distributed over $\mathcal{X}$.

From a physical point of view, the fact that states $\mu$ are assumed to be $T$-invariant
means that the statistical description of dynamical systems refers to equilibrium
states. Interestingly, a measure-theoretical dynamical triplet can be represented in
terms of a unitary operator on a Hilbert space [18,77].


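Example 2.1.1 (Koopman-von Neumann Formalism [209])


Let $(\mathcal{X}, T, \mu)$ be a measure-theoretic dynamical triplet. Finite additions and
multiplications of characteristic functions of measurable subsets $A_i \subseteq \mathcal{X}$
give the algebra $G(\mathcal{X})$ of simple functions $s = \sum_i c_i 1_{A_i}$ over
$\mathcal{X}$. Lebesgue-integration with respect to $\mu$ defines a scalar product
$\langle s_1 | s_2 \rangle_\mu$ over $G(\mathcal{X})$: for
$s_a = \sum_i c^a_i 1_{A^a_i}$, $a = 1, 2$,

\[
  \langle s_1 | s_2 \rangle_\mu
  = \sum_{i,j} (c^1_i)^* c^2_j \, \mu\big( 1_{A^1_i} 1_{A^2_j} \big)
  = \sum_{i,j} (c^1_i)^* c^2_j \, \mu\big( A^1_i \cap A^2_j \big) ,
\]

for $1_A(x) 1_B(x) = 1_{A \cap B}(x)$. Further, by linearly extending the map defined by
$1_A \mapsto U_T 1_A := 1_A \circ T = 1_{T^{-1}(A)}$, one gets a linear operator $U_T$ on
$G(\mathcal{X})$. Since $\mu \circ T^{-1} = \mu$, $U_T$ preserves scalar products,

\[
  \langle U_T s_1 | U_T s_2 \rangle_\mu
  = \sum_{i,j} (c^1_i)^* c^2_j \, \mu\big( T^{-1}(A^1_i \cap A^2_j) \big)
  = \langle s_1 | s_2 \rangle_\mu .
\]

Therefore, the Koopman operator $U_T$ can be extended to an isometric implementation
of the dynamics (invertible and thus unitary in the reversible case) on the Hilbert
space $L^2_\mu(\mathcal{X})$ of square-summable functions on $\mathcal{X}$,

\[
  (U_T \psi)(x) = \psi(Tx) \qquad \forall\, \psi \in L^2_\mu(\mathcal{X}) ,\ \forall\, x \in \mathcal{X} . \tag{2.4}
\]

¹ $1_A(x) = 1$ if $x \in A$, $1_A(x) = 0$ otherwise.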
The spectral properties of $U_T$ will turn out to be of particular relevance for ergodic
theory (see Sect. 2.3). Using a bra-ket quantum-like notation, we observe that:


1. the identity function $\mathbb{1}(x) = 1$ almost everywhere with respect to $\mu$ is always
an eigenvector of $U_T$ with eigenvalue 1,
$U_T | \mathbb{1} \rangle = | \mathbb{1} \circ T \rangle = | \mathbb{1} \rangle$;
2. if 1 is a degenerate eigenvalue, then there exist constants of the motion
$\mathbb{1} \neq \psi \in L^2_\mu(\mathcal{X})$,
$U_T | \psi \rangle = | \psi \circ T \rangle = | \psi \rangle$;
3. mean values are scalar products, $\mu(\psi) = \langle \mathbb{1} | \psi \rangle$, for all
$\psi \in L^2_\mu(\mathcal{X})$;
4. products of mean values amount to the matrix elements of the orthogonal projection
$| \mathbb{1} \rangle \langle \mathbb{1} |$,

\[
  \mu(\psi)\, \mu(\phi) = \langle \psi^* | \mathbb{1} \rangle \langle \mathbb{1} | \phi \rangle
  \qquad \forall\, \psi, \phi \in L^2_\mu(\mathcal{X}) , \tag{2.5}
\]

where $\psi^*$ is the complex conjugate of $\psi$.
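The Koopman construction and the first and third observations can be made concrete on a finite phase space. The sketch below is only an illustration under stated assumptions: five points, the uniform measure, an invertible rotation as dynamical map, and real-valued functions stored as lists, so that complex conjugation is trivial. All names (`koopman`, `inner`, `iterate`) are hypothetical.

```python
from fractions import Fraction

N = 5
points = range(N)
mu = [Fraction(1, N)] * N        # uniform measure, invariant under any permutation
T = lambda x: (x + 2) % N        # invertible dynamical map (rotation of Z_5)

def koopman(psi):
    """(U_T psi)(x) = psi(T x), with functions stored as lists of values."""
    return [psi[T(x)] for x in points]

def inner(phi, psi):
    """<phi|psi> = sum_x mu(x) phi(x) psi(x) for real-valued functions."""
    return sum(mu[x] * phi[x] * psi[x] for x in points)

def iterate(psi, k):
    """Apply the Koopman operator k times."""
    for _ in range(k):
        psi = koopman(psi)
    return psi

one = [1] * N                    # the identity function
psi = [3, -1, 4, 1, -5]
phi = [2, 7, 1, 8, 2]
```

Exact rational arithmetic is used so that the preservation of scalar products can be tested with equality rather than with a numerical tolerance. Since the rotation has period five, five applications of `koopman` return the original function.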


2.1.1 Hamiltonian Mechanics 


Hamiltonian mechanics is an important source of classical dynamical systems
[17,18,352]. Systems with $f$ degrees of freedom are described by a phase-space
which is a $2f$-dimensional manifold $M_f \subseteq \mathbb{R}^f \times \mathbb{R}^f$ whose
points $r = (q, p)$ consist of positions $q = (q_1, \ldots, q_f) \in \mathbb{R}^f$ and momenta
$p = (p_1, \ldots, p_f) \in \mathbb{R}^f$. The phase-space inherits a symplectic geometry from
the symplectic matrix

\[
  \mathbb{J} = [J_{ij}] = \begin{pmatrix} 0_f & 1_f \\ -1_f & 0_f \end{pmatrix} ,
\]

where $0_f$ and $1_f$ are the $f \times f$ zero and identity matrices, respectively.

Via the symplectic matrix one defines the Poisson brackets of two (differentiable)
functions $F, G : M_f \to \mathbb{R}$,

\[
  \{F , G\}(r) := \sum_{i,j=1}^{2f} \frac{\partial F(r)}{\partial r_i} \, J_{ij} \, \frac{\partial G(r)}{\partial r_j} . \tag{2.6}
\]


With respect to them, $q$ and $p$ are canonical coordinates, $\{q_i , p_j\} = \delta_{ij}$, and
the time-evolution is generated by the Hamilton equations

\[
  \frac{dq}{dt} = \{q , H\}(r) , \qquad \frac{dp}{dt} = \{p , H\}(r) ,
\]

where $H = H(r)$ is a (time-independent) Hamiltonian or energy function of the
system. They are solved by the Hamiltonian flux $r \mapsto r(t) = \Phi^H_t(r)$,
$t \in \mathbb{R}$.² The time-evolution of functions $F$ on $M_f$ then amounts to a group of
dynamical maps $F \mapsto F_t := F \circ \Phi^H_t$ that solves the time-evolution equation

\[
  \frac{dF_t(r)}{dt} = \{F_t , H\}(r) . \tag{2.7}
\]

² One can always extract a discrete time-evolution $\{T^n\}_{n \in \mathbb{Z}}$ from it by
fixing $t = 1$ and setting $T := \Phi^H_1$.
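The Poisson bracket and the Hamilton equations can be checked numerically. The following sketch assumes the block form of the symplectic matrix with the identity in the upper-right block, the ordering r = (q, p), and gradients approximated by central finite differences; the one-dimensional harmonic oscillator with mass 2 and frequency 3 is an arbitrary test case, and all function names are hypothetical.

```python
def gradient(F, r, h=1e-6):
    """Central finite-difference gradient of F at the phase-point r."""
    grad = []
    for i in range(len(r)):
        up = list(r)
        dn = list(r)
        up[i] += h
        dn[i] -= h
        grad.append((F(up) - F(dn)) / (2 * h))
    return grad

def symplectic_matrix(f):
    """2f x 2f matrix with blocks [[0, 1], [-1, 0]], for r = (q, p)."""
    J = [[0.0] * (2 * f) for _ in range(2 * f)]
    for i in range(f):
        J[i][f + i] = 1.0
        J[f + i][i] = -1.0
    return J

def poisson_bracket(F, G, r):
    """{F, G}(r) = sum_ij dF/dr_i J_ij dG/dr_j."""
    f = len(r) // 2
    J = symplectic_matrix(f)
    dF, dG = gradient(F, r), gradient(G, r)
    return sum(dF[i] * J[i][j] * dG[j] for i in range(2 * f) for j in range(2 * f))

m, omega = 2.0, 3.0                                   # assumed test parameters
H = lambda r: r[1] ** 2 / (2 * m) + m * omega ** 2 * r[0] ** 2 / 2
q = lambda r: r[0]
p = lambda r: r[1]
r0 = [0.7, -1.3]
```

At the point `r0`, the brackets `{q, H}` and `{p, H}` reproduce the right-hand sides of the Hamilton equations, namely p/m and minus m omega^2 q, while `{q, p}` equals 1 and `{H, H}` vanishes by antisymmetry.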


Suppose $M_f = \mathbb{R}^{2f}$; then, a natural $\sigma$-algebra for the phase-space $M_f$ is
the Borel $\sigma$-algebra (see Remark 2.1.1.2) containing all open subsets of the topology of
$M_f$ given by the Euclidean distance. The Liouville measure
$dr = \prod_{i=1}^{f} dq_i\, dp_i$ is invariant under the Hamiltonian flux $\Phi^H_t$; however,
$\int_{M_f} dr$ diverges and cannot be normalized to a probability distribution. A way out
typically occurs when there are constants of the motion, that is functions $F$ on $M_f$, like
the Hamiltonian itself, such that $\{F , H\} = 0$. By fixing their values, the dynamics is
restricted to time-invariant submanifolds that usually have finite volumes.

Instances of equilibrium states leading to descriptions of Hamiltonian systems as
measure-theoretical triplets $(M_f, \Phi^H_1, \mu_H)$ (discrete time), or
$(M_f, \{\Phi^H_t\}_{t \in \mathbb{R}}, \mu_H)$ (continuous time), are in general provided by
probability distributions $d\mu_H(r) = f(r)\, dr$, where $f : M_f \to \mathbb{R}^+$ is a
normalized, positive function such that $\{f , H\} = 0$. Prominent instances of such
probability measures are the micro-canonical, canonical and grand-canonical states of
classical statistical mechanics [353].

The time-invariance of states such as the previous ones deserves to be examined in
some more detail as it follows from a duality argument which we shall frequently
encounter in the following. Duality is essentially the observation that the mean value
of a function $F$ at time $t$, $F_t$, with respect to a state $\mu$ equals the mean value of
$F$ with respect to the state $\mu_t$ at time $t$, $\mu(F_t) = \mu_t(F)$. This defines the
time-evolution of states as the dual of the time-evolution of observables (functions); indeed,
from time-invariance of the Liouville measure it follows that

\[
  \mu(F_t) = \int_{M_f} dr\, \mu(r)\, F(\Phi^H_t(r))
  = \int_{M_f} dr\, \mu(\Phi^H_{-t}(r))\, F(r) =: \mu_t(F) , \tag{2.8}
\]

whence $\mu_t := \mu \circ \Phi^H_{-t}$ solves the time-evolution equation

\[
  \frac{\partial \mu_t(r)}{\partial t} = \{H , \mu_t\}(r) . \tag{2.9}
\]


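Example 2.1.2 (Regular Motion) Consider two uncoupled harmonic one-dimensional
oscillators described by $r = (q_1, q_2, p_1, p_2) \in M_2 = \mathbb{R}^4$ and by the
Hamiltonian

\[
  H(r) = \underbrace{\frac{p_1^2}{2 m_1} + \frac{m_1 \omega_1^2}{2}\, q_1^2}_{H_1(r)}
       + \underbrace{\frac{p_2^2}{2 m_2} + \frac{m_2 \omega_2^2}{2}\, q_2^2}_{H_2(r)} .
\]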
By fixing the single oscillator energies $H_i(r) = E_i$, $i = 1, 2$, the motion develops
on the 2-torus $\mathbb{T}^2 := \{\theta = (\theta_1, \theta_2) : \theta_i \in [0, 2\pi)\}$,
where it amounts to a two-dimensional rotation. Indeed, setting

\[
  J_i(r) := \frac{H_i(r)}{\omega_i} = \frac{p_i^2}{2 m_i \omega_i} + \frac{m_i \omega_i}{2}\, q_i^2 ,
  \qquad \tan\theta_i := -\frac{p_i}{m_i \omega_i q_i} , \quad \text{hence}
\]
\[
  q_i = \sqrt{\frac{2 J_i}{m_i \omega_i}}\, \cos\theta_i , \qquad
  p_i = -\sqrt{2 m_i \omega_i J_i}\, \sin\theta_i ,
\]

one gets angle-action variables $(\theta, J)$, $\theta = (\theta_1, \theta_2)$,
$J := (J_1, J_2)$. These are canonical coordinates, with Poisson brackets
$\{\theta_i , J_k\} = \delta_{ik}$; moreover,
$H(r) = K(J) = \omega_1 J_1 + \omega_2 J_2$. Thus, the corresponding Hamilton equations,

\[
  \frac{d\theta}{dt} = \omega , \qquad \frac{dJ}{dt} = 0 , \qquad \omega := (\omega_1, \omega_2) ,
\]

are solved by the Hamiltonian flux

\[
  T_t : \theta \mapsto \theta(t) := T_t(\theta) = \theta + \omega t . \tag{2.10}
\]

By varying $E_1$, $E_2$ and thus $J$, the phase-space $\mathbb{R}^4$ is covered by
2-dimensional tori that do not intersect. On each fixed torus, $d\theta/(2\pi)^2$ gives a
probability measure which is invariant under the Hamiltonian flux. The triplet
$(\mathbb{T}^2, T := T_{t=1}, d\theta/(2\pi)^2)$ fulfils the requirements in Definition 2.1.1.
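The passage to angle-action variables can be tested on a single oscillator. The sketch below assumes the sign convention in which the angle increases in time, so that q is proportional to the cosine and p to minus the sine of the angle; it uses the closed-form solution of the harmonic oscillator as the Hamiltonian flux. The function names and the numerical values in the usage lines are hypothetical.

```python
import math

def to_action_angle(q, p, m, omega):
    """Map (q, p) to (theta, J) with tan(theta) = -p / (m omega q)."""
    J = p * p / (2 * m * omega) + m * omega * q * q / 2
    theta = math.atan2(-p, m * omega * q) % (2 * math.pi)
    return theta, J

def from_action_angle(theta, J, m, omega):
    """Inverse map: q = sqrt(2J/(m omega)) cos(theta), p = -sqrt(2 m omega J) sin(theta)."""
    q = math.sqrt(2 * J / (m * omega)) * math.cos(theta)
    p = -math.sqrt(2 * m * omega * J) * math.sin(theta)
    return q, p

def exact_flow(q, p, m, omega, t):
    """Closed-form solution of the harmonic oscillator after a time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return q * c + p / (m * omega) * s, p * c - m * omega * q * s

def angle_gap(a, b):
    """Distance between two angles on the circle."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

# Usage: evolve a point and compare with the rotation theta -> theta + omega t.
m, omega, t = 1.5, 2.0, 0.8
theta0, J0 = to_action_angle(0.3, -1.1, m, omega)
theta1, J1 = to_action_angle(*exact_flow(0.3, -1.1, m, omega, t), m, omega)
```

Along the flow the action stays constant while the angle advances by omega t modulo 2 pi, which is the content of (2.10) for one degree of freedom.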

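In the Koopman-von Neumann formalism, the unitary operator $U_T$ implementing
$T$ on $\mathbb{H} := L^2(\mathbb{T}^2, d\theta/(2\pi)^2)$ has the exponential functions
$e_n(\theta) = \exp(i\, n \cdot \theta)$, $n \in \mathbb{Z}^2$, as eigenfunctions,

\[
  (U_T e_n)(\theta) = e_n(\theta + \omega)
  = e^{i \sum_{j=1}^{2} n_j (\theta_j + \omega_j)}
  = e^{i \sum_{j=1}^{2} n_j \omega_j}\, e_n(\theta) . \tag{2.11}
\]

Therefore, the time-evolution of
$\mathbb{H} \ni | \psi \rangle = \sum_{n \in \mathbb{Z}^2} \psi(n)\, | e_n \rangle$ is given by

\[
  | \psi_k \rangle = U_T^k | \psi \rangle
  = \sum_{n \in \mathbb{Z}^2} \psi(n)\, e^{i k \sum_{j=1}^{2} n_j \omega_j}\, | e_n \rangle ,
  \qquad k \in \mathbb{Z} . \tag{2.12}
\]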


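Remarks 2.1.2

1. If $\omega_2/\omega_1 = p/q$, $p, q \in \mathbb{N}$, trajectories close since $\theta(2q\pi/\omega_1) = \theta \bmod 2\pi$.

2. If there are no $0 \neq n_{1,2} \in \mathbb{Z}$ such that $n_1\omega_1 + n_2\omega_2 = 0$, then every trajectory
$\{\theta(t)\}_{t\in\mathbb{R}}$ fills the 2-torus $\mathbb{T}^2$ densely. Namely, for any $\varepsilon > 0$ and $\varphi, \theta \in \mathbb{T}^2$, there is
$t \in \mathbb{R}$ such that $\|\theta(t) - \varphi\| \le \varepsilon$, where the norm is the Euclidean norm computed
mod $2\pi$. Indeed, using (2.10),

\[ t_* = (\varphi_1 - \theta_1)/\omega_1 \;\Longrightarrow\; \theta_1(t_* + 2n\pi/\omega_1) = \varphi_1 \bmod 2\pi , \]

for all $n \in \mathbb{Z}$. Since $\mathbb{T}$ is compact, the sequence $\{\theta_2(t_* + 2n\pi/\omega_1)\}_{n\in\mathbb{Z}}$ has
accumulation points; thus, for any $\varepsilon > 0$ there exist $n, p \in \mathbb{N}$ such that

\[ \left|\theta_2(t_* + 2(n+p)\pi/\omega_1) - \theta_2(t_* + 2n\pi/\omega_1)\right| = 2p\pi\, \frac{\omega_2}{\omega_1} \bmod 2\pi \le \varepsilon , \]

whence the sequence $\{\theta_2(t_* + 2np\pi/\omega_1)\}_{n\in\mathbb{N}}$ subdivides the circle into disjoint
intervals $\Delta_n$ of length

\[ \left|\theta_2(t_* + 2(n+1)p\pi/\omega_1) - \theta_2(t_* + 2np\pi/\omega_1)\right| \le \varepsilon . \]

Therefore,

\[ \varphi_2 \in \Delta_m \;\Longrightarrow\; \|\theta(t_* + 2mp\pi/\omega_1) - \varphi\| = \left|\theta_2(t_* + 2mp\pi/\omega_1) - \varphi_2\right| \le \varepsilon . \]

3. A similar argument as before shows that, in discrete time, trajectories $\{\theta(n)\}_{n\in\mathbb{Z}}$
fill $\mathbb{T}^2$ densely if and only if there are no integers $n_{1,2} \neq 0$ such that
$n_1\omega_1 + n_2\omega_2 = 2\pi p$ with $\mathbb{Z} \ni p \neq 0$ [111].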

4. Example 2.1.2 is a particular instance of the Liouville-Arnold theorem [17,18,
352] on integrable Hamiltonian systems. Suppose a canonical system with $f$
degrees of freedom possesses $f$ global constants of the motion $K_i$, $K_1 := H$,
in involution, that is $\{K_i, K_j\} = 0$, $i, j = 1, 2, \ldots, f$. If the subset $N_k :=
\{K_i(r) = k_i : i = 1, 2, \ldots, f\} \subseteq M_f$ is compact and connected and the
differential 1-forms $dK_i$ are linearly independent on it, then $N_k$ is isomorphic to the
$f$-torus $\mathbb{T}^f$. Moreover, there exists a canonical transformation from $r \in N_k$ to
angle-action variables $(\theta, J)$ such that the Hamiltonian flux is isomorphic to an
$f$-dimensional rotation on $\mathbb{T}^f$ with $J$-dependent frequencies: $\theta(t) = \theta + \omega(J)\, t$.
Accordingly, the phase-space $M_f$ foliates into disjoint $f$-tori which are filled
densely by the trajectories $\{\theta(t)\}_{t\in\mathbb{R}}$ when $\sum_{i=1}^f n_i\omega_i(J) = 0$, $n_i \in \mathbb{Z}$, only if
all $n_i = 0$. Tori such that $\sum_{i=1}^f n_i\omega_i(J) = 0$ for $0 \neq n_i \in \mathbb{Z}$ are called resonant
and on them trajectories close. The independence of the oscillation frequencies
$\omega$ from the actions $J$ in Example 2.1.2 is an exception due to the linearity of the
Hamilton equations.
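
The closing and filling statements of Remarks 2.1.2.1 and 2.1.2.2 can be probed numerically. The following Python sketch is only illustrative: the initial angles, the two pairs of frequencies and the target point $\varphi$ are arbitrary choices that do not come from the text, and the helper names (`flow`, `torus_dist`) are hypothetical.

```python
import math

TWO_PI = 2 * math.pi

def flow(theta, omega, t):
    """Hamiltonian flux (2.10): theta -> theta + omega * t, taken mod 2*pi."""
    return tuple((th + w * t) % TWO_PI for th, w in zip(theta, omega))

def circle_dist(a, b):
    """Distance between two angles, computed mod 2*pi."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def torus_dist(a, b):
    """Euclidean norm of the angle differences, each computed mod 2*pi."""
    return math.hypot(circle_dist(a[0], b[0]), circle_dist(a[1], b[1]))

theta0 = (0.3, 1.1)  # arbitrary initial angles (assumption)

# Commensurate case: omega2/omega1 = p/q = 3/2, so the orbit closes at t = 2*q*pi/omega1.
omega_rational = (1.0, 1.5)
q = 2
closing_error = torus_dist(
    flow(theta0, omega_rational, 2 * q * math.pi / omega_rational[0]), theta0
)

# Incommensurate case: sample at the times t* + 2*n*pi/omega1 used in Remark 2.1.2.2,
# for which theta_1 coincides with phi_1, and look for the closest approach to phi.
omega_irrational = (1.0, math.sqrt(2.0))
phi = (2.0, 4.0)  # arbitrary target point (assumption)
t_star = (phi[0] - theta0[0]) / omega_irrational[0]
best_approach = min(
    torus_dist(flow(theta0, omega_irrational,
                    t_star + 2 * n * math.pi / omega_irrational[0]), phi)
    for n in range(2000)
)
```

Here `closing_error` vanishes up to rounding, while `best_approach` decreases as more sampling times are included, in line with the density argument.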

Integrable Hamiltonian systems cannot behave too irregularly as their motion 
amounts to a multi-dimensional rotation over invariant tori. In order to increase the 
degree of irregularity, some constants of the motion must disappear in order to let the 
trajectories wander around according to less predictable patterns. In the following 
example, a constant of the motion is eliminated by means of a folding condition. 
Example 2.1.3 (Hyperbolic Behavior [18,321]) Let $\delta_p(t)$ denote the periodic delta
function $\sum_{n\in\mathbb{Z}} \delta(n - t)$ with unit period and consider a free one-dimensional motion
with periodic quadratic kicks, occurring with strength $\beta \in \mathbb{R}$, according to the pulsed
Hamiltonian

\[ H = \frac{p^2}{2} + \delta_p(t)\, \frac{\beta\, q^2}{2} . \]


A natural dynamical map $T$ consists in updating the vector $r = (q, p)$ on phase
space from immediately after the $n$-th kick to immediately after the $(n+1)$-th one;
namely $T : r_n \mapsto r_{n+1}$, where $r_n := (q_n, p_n)$ and

\[ q_n := \lim_{\varepsilon\to 0^+} q(n + \varepsilon) , \qquad p_n := \lim_{\varepsilon\to 0^+} p(n + \varepsilon) . \]


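Integrating the Hamilton equations

\[ \frac{dq}{dt} = p , \qquad \frac{dp}{dt} = -\delta_p(t)\, \beta\, q , \]

first between $n + \varepsilon$ and $(n+1) - \varepsilon$ and then between $(n+1) - \varepsilon$ and
$(n+1) + \varepsilon$, yields

\[ q(n+1+\varepsilon) - q(n+\varepsilon) = (1 - 2\varepsilon)\, p(n+\varepsilon) + \int_{n+1-\varepsilon}^{n+1+\varepsilon} ds\, p(s) , \]

\[ p(n+1+\varepsilon) - p(n+\varepsilon) = -\beta\, q(n+1) . \]

By letting $\varepsilon \to 0^+$, the integral is of order $\varepsilon$ and vanishes; thus, the dynamical map
$T$ reduces to a $2\times 2$ matrix acting on $\mathbb{R}^2$:

\[ r = \begin{pmatrix} q \\ p \end{pmatrix} \mapsto A\, r , \qquad A = \begin{pmatrix} 1 & 1 \\ -\beta & 1 - \beta \end{pmatrix} . \tag{2.13} \]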


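Since $\det(A) = 1$, the Liouville measure $dr = dq\, dp$ is $T$-invariant. The eigenvalues
of $A$,

\[ \alpha^{\pm 1} = \frac{2 - \beta \pm \sqrt{\beta(\beta - 4)}}{2} = \frac{2 - \beta \pm \sqrt{(\beta - 2)^2 - 4}}{2} , \]

are real with $|\alpha| > 1$ when $\beta < 0$ or $\beta > 4$. The corresponding eigenvector $|a_+\rangle$
identifies a direction in $\mathbb{R}^2$ along which lengths increase exponentially for $n > 0$,
while they contract exponentially along the direction of the eigenvector $|a_-\rangle$ relative
to the other eigenvalue $|\alpha|^{-1} < 1$. This motion is called hyperbolic.

For $\beta = -1$, $A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$ is symmetric, thus $\langle a_- | a_+\rangle = 0$ and, writing $|r\rangle =
\gamma\, |a_+\rangle + \delta\, |a_-\rangle$,

\[ \|r_n\|^2 = |\gamma|^2\, e^{2n\log\alpha} + |\delta|^2\, e^{-2n\log\alpha} , \tag{2.14} \]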
where $r_n := A^n r$. Therefore, the norms of all vectors $r \neq 0$ increase exponentially
while remaining on the hyperbolae selected by fixing a value of $F(r) := q^2 - p^2 +
qp$. Indeed, one can directly check that $F(r_{n+1}) = F(r_n)$, whence this function is a
constant of the motion [141]. This is no longer true if one imposes a folding condition
that forces the dynamics to develop on the two-dimensional torus $\mathbb{T}^2 := \{\mathbb{R}^2 \ni r =
(q, p) \bmod 1\}$, namely if one defines the dynamical map

\[ T_A^n : \mathbb{T}^2 \ni r \mapsto r_n := (A^n r \bmod 1) \in \mathbb{T}^2 . \tag{2.15} \]

Then, the resulting triplet $(\mathbb{T}^2, T_A, dr)$ is as in Definition 2.1.1 and the map $T_A$ is
known as Arnold Cat Map [18].
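
The statements of Example 2.1.3 up to this point can be verified with exact rational arithmetic. The sketch below is a hedged illustration, not part of the original text: it assumes the one-period matrix $A$ with rows $(1, 1)$ and $(-\beta, 1-\beta)$ obtained by integrating the Hamilton equations over one period, and the initial point $r = (1/3, 1/7)$ as well as the function names are hypothetical choices.

```python
from fractions import Fraction
import math

def kick_matrix(beta):
    """One-period map of the kicked free particle: free flight followed by a kick."""
    return ((1, 1), (-beta, 1 - beta))

def apply(A, r):
    """Matrix-vector product A r for a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = A
    q, p = r
    return (a * q + b * p, c * q + d * p)

def eigenvalues(beta):
    """Roots of x^2 - (2 - beta) x + 1 = 0, or None when they are complex."""
    disc = beta * (beta - 4)
    if disc < 0:
        return None  # complex conjugate eigenvalues of modulus one
    s = math.sqrt(disc)
    return ((2 - beta + s) / 2, (2 - beta - s) / 2)

def F(r):
    """Candidate constant of the motion for beta = -1."""
    q, p = r
    return q * q - p * p + q * p

def cat_map(r):
    """Folded map: apply A for beta = -1, then reduce both components mod 1."""
    q, p = apply(kick_matrix(-1), r)
    return (q % 1, p % 1)

A = kick_matrix(-1)                      # ((1, 1), (1, 2))
r = (Fraction(1, 3), Fraction(1, 7))     # arbitrary rational initial point
unfolded = apply(A, apply(A, r))         # two steps on the plane
folded = cat_map(cat_map(r))             # two steps on the torus
```

On the plane `F(unfolded)` equals `F(r)` exactly, whereas after the second folded step `F(folded)` differs from `F(r)`: the folding condition removes the constant of the motion.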

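More in general, one may consider the dynamics on the 2-dimensional torus $\mathbb{T}^2$
generated as in (2.15) by a matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \qquad a, b, c, d \in \mathbb{Z} : \; ad - bc = 1 , \quad |a + d| > 2 , \tag{2.16} \]

with eigenvalues $\alpha^{\pm 1} \in \mathbb{R}$. Since $A$ need not be Hermitian, its normalized
eigenvectors $|a_\pm\rangle = \begin{pmatrix} a_{1\pm} \\ a_{2\pm} \end{pmatrix}$ are in general only linearly independent; one explicitly computes

\[ a_{1\pm} = b\, \Delta_\pm , \qquad a_{2\pm} = (\alpha^{\pm 1} - a)\, \Delta_\pm , \qquad \text{where} \quad \Delta_\pm := \frac{1}{\sqrt{b^2 + (\alpha^{\pm 1} - a)^2}} , \tag{2.17} \]

and expands $\mathbb{R}^2 \ni |r\rangle = \begin{pmatrix} x \\ y \end{pmatrix} = C_+(r)\, |a_+\rangle + C_-(r)\, |a_-\rangle$ with

\[ C_+(r) = \frac{x\, a_{2-} - y\, a_{1-}}{\Delta} , \qquad C_-(r) = \frac{y\, a_{1+} - x\, a_{2+}}{\Delta} , \tag{2.18} \]

where

\[ \Delta := \mathrm{Det} \begin{pmatrix} a_{1+} & a_{1-} \\ a_{2+} & a_{2-} \end{pmatrix} = b\, (\alpha^{-1} - \alpha)\, \Delta_+ \Delta_- . \tag{2.19} \]

Then, the hyperbolic behavior shows up since

\[ A^n |r\rangle = \alpha^n\, C_+(r)\, |a_+\rangle + \alpha^{-n}\, C_-(r)\, |a_-\rangle \tag{2.20} \]

and the absolute value of one of the eigenvalues

\[ \alpha^{\pm 1} = \frac{a + d \pm \sqrt{(a + d)^2 - 4}}{2} \]

is larger than 1.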
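Consider now the Koopman operator $U_A$ on $\mathbb{H} := L^2_{dr}(\mathbb{T}^2)$; the orthogonal
exponential functions

\[ e_n(r) := \exp(2\pi i\, n\cdot r) \tag{2.21} \]

are such that ($A^T$ denotes the transposed of $A$)

\[ (U_A e_n)(r) = e^{2\pi i\, n\cdot(A r)} = e^{2\pi i\, (A^T n)\cdot r} = e_{A^T n}(r) , \tag{2.22} \]

whence, setting $\hat\psi(n) := \langle e_n | \psi\rangle$ for all $\psi \in \mathbb{H}$, it turns out that

\[ \widehat{(U_A\psi)}(n) = \langle e_n | U_A | \psi\rangle = \langle e_{(A^T)^{-1} n} | \psi\rangle = \hat\psi\big((A^T)^{-1} n\big) . \]

Therefore, $U_A$ has no other eigenvector but $e_0 = \mathbb{1}$: if $U_A |\psi\rangle = \mu\, |\psi\rangle$ for $\psi \in \mathbb{H}$
with $|\mu| = 1$, then, with $|\psi\rangle = \sum_{n\in\mathbb{Z}^2} \hat\psi(n)\, |e_n\rangle$,

\[ \langle e_m | U_A^k | \psi\rangle = \sum_{n\in\mathbb{Z}^2} \hat\psi(n)\, \langle e_m | e_{(A^T)^k n}\rangle = \hat\psi\big((A^T)^{-k} m\big) = \mu^k\, \hat\psi(m) , \]

for any fixed $m \in \mathbb{Z}^2$. Since $\hat\psi(n) \to 0$ with $\|n\| \to \infty$, if $\hat\psi(m) \neq 0$ with $m \neq 0$, then
$\lim_{k\to\infty} \hat\psi\big((A^T)^{-k} m\big) = 0$ because of hyperbolicity, while $\mu^k\, \hat\psi(m)$ oscillates.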


The exponential amplification of small errors that results from (2.14) (or from
(2.20)) cannot hold for arbitrarily large $n$: in fact, $\|r_n\| \le \sqrt{2}$ so that the expansion
is eventually counteracted by the folding condition in (2.15). Suppose $|r\rangle = \varepsilon\, |a_+\rangle$;
then $\|r_n\| = \varepsilon\, e^{n\log\alpha} \le \sqrt{2}$ increases until $n \le \log(\varepsilon^{-1}\sqrt{2})/\log\alpha$.

This argument applies to any pair of initial conditions $r^{1,2}$; their distance
$\|r^1 - r^2\|$ increases exponentially due to the expanding contribution from the component
of $r^1 - r^2$ along $|a_+\rangle$ until the folding condition affects one of the two cartesian
components of $r^1 - r^2$. Notice however that the smaller $\|r^1 - r^2\|$ is, the longer
the amplification lasts. This observation allows the introduction of the notion of
asymptotic divergence rate of initially close trajectories even when they develop
on compact phase-spaces: these rates are known as Lyapounov exponents and are a
measure of dynamical instability.


Definition 2.1.2 (Maximal Lyapounov Exponent [128,239]) The maximal positive
Lyapounov exponent of a dynamical triplet $(\mathcal{X}, T, \mu)$ equipped with a distance
$d(x, y)$ is defined by

\[ \lambda_M(x) = \lim_{n\to\infty} \frac{1}{n} \lim_{d(x,y)\to 0} \log \frac{d(T^n x, T^n y)}{d(x, y)} . \]


Of course $\mathcal{X}$ may be a multi-dimensional space and thus there might be more
directions along which distances expand exponentially fast with exponents $\lambda(x) > 0$;
the intuitive picture behind the definition is that, for sufficiently small $d(x, y)$, the
distance at time $n$ is such that

\[ d(T^n x, T^n y) = e^{n\lambda(x)}\, d(x, y)\, (1 + O(\varepsilon)) , \]

where $\lambda(x) \le \lambda_M(x)$ [78].
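
The intuitive picture behind Definition 2.1.2 can be illustrated on the folded map (2.15) with $\beta = -1$. The Python sketch below is only indicative: the initial point, the initial separation $10^{-12}$ and the number of steps are arbitrary assumptions, and exact rationals are used so that the folding introduces no rounding error. For this map the expected maximal exponent is $\log\big((3 + \sqrt{5})/2\big)$, the logarithm of the larger eigenvalue of $A$.

```python
from fractions import Fraction
import math

def cat_map(r):
    """Folded map (2.15) for A with rows (1, 1) and (1, 2), on exact rationals."""
    q, p = r
    return ((q + p) % 1, (q + 2 * p) % 1)

def torus_distance(r, s):
    """Euclidean distance with each coordinate difference taken mod 1."""
    total = 0.0
    for a, b in zip(r, s):
        d = abs(a - b) % 1
        d = min(d, 1 - d)
        total += float(d) ** 2
    return math.sqrt(total)

def separations(x, y, n_steps):
    """Distances d(T^n x, T^n y) for n = 0, 1, ..., n_steps."""
    out = [torus_distance(x, y)]
    for _ in range(n_steps):
        x, y = cat_map(x), cat_map(y)
        out.append(torus_distance(x, y))
    return out

x0 = (Fraction(1, 3), Fraction(1, 7))    # arbitrary initial point (assumption)
eps = Fraction(1, 10**12)                # arbitrary small initial separation
y0 = (x0[0] + eps, x0[1])

d = separations(x0, y0, 40)
lam_exact = math.log((3 + math.sqrt(5)) / 2)
# Finite-time estimate taken before the folding saturates the growth.
lam_estimate = math.log(d[20] / d[0]) / 20
```

The finite-time estimate approaches `lam_exact` only slowly, since the initial displacement is not aligned with the expanding direction; at later times the distances stop growing because they are bounded on the torus.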
Remarks 2.1.3

1. A rigorous approach to Lyapounov exponents can be found in [239]; here, we
sketch a few basic facts (see [128,370]). Assume the phase-space $\mathcal{X}$ to be a
compact manifold with a $C^\infty$ differentiable structure, a Borel $\sigma$-algebra and a
Riemannian metric such that the tangent spaces $\mathcal{T}_x(\mathcal{X})$ at $x \in \mathcal{X}$ are isomorphic to
$\mathbb{R}^k$ equipped with a Euclidean structure. The dynamics $T : \mathcal{X} \to \mathcal{X}$ is assumed
to be continuous with continuous first derivatives, so that one can focus upon
its linearization $\mathcal{T}_x(T)$ that maps the tangent space $\mathcal{T}_x(\mathcal{X})$ into the tangent space
$\mathcal{T}_{Tx}(\mathcal{X})$. In particular, one is interested in the asymptotic behavior of $\|\mathcal{T}_x(T^n)\|$
where, by the chain rule,

\[ \mathcal{T}_x(T^n) = \mathcal{T}_{T^{n-1}x}(T) \circ \mathcal{T}_{T^{n-2}x}(T) \circ \cdots \circ \mathcal{T}_x(T) . \]

Let $\mathcal{X}$ be equipped with a $T$-invariant regular Borel measure $\mu$ (see
Remark 2.1.1.5); then, there exists a measurable subset $B \subseteq \mathcal{X}$ with $\mu(B) = 1$
and a positive measurable function $s : B \to \mathbb{R}_+$ such that, given $x \in B$, there
are real numbers $\{\lambda^{(j)}(x)\}_{j=1}^{s(x)}$, $\lambda^{(j)}(x) < \lambda^{(j+1)}(x)$, and linear subspaces of $\mathbb{R}^k$,
$\{V^{(j)}(x)\}_{j=0}^{s(x)}$, $V^{(0)}(x) = \{0\}$, $V^{(j)}(x) \subset V^{(j+1)}(x)$, $V^{(s(x))}(x) = \mathbb{R}^k$, for which

a) $\lim_{n\to\infty} \frac{1}{n} \log \|\mathcal{T}_x(T^n)\, r\| = \lambda^{(j)}(x)$ for all $r \in W_j(x) := V^{(j)}(x) \ominus V^{(j-1)}(x)$;

b) $\lambda^{(j)}(x)$ is defined, measurable and $T$-invariant on the subset of $x \in B$ such
that $s(x) \ge j$, that is $\lambda^{(j)}(Tx) = \lambda^{(j)}(x)$;

c) $\mathcal{T}_x(T)\, V^{(j)}(x) \subseteq V^{(j)}(Tx)$ for all $j \le s(x)$.

It thus follows that, if $\lambda^{(j)}(x) < 0$, the norms of all $r \in V^{(j)}(x)$ go to 0
exponentially fast with $n \to +\infty$. On the other hand, if $\lambda^{(j)}(x) > 0$, the norms of all
vectors $r \in V^{(j)}(x) \ominus V^{(j-1)}(x)$ diverge exponentially fast.

2. There can be more than one positive Lyapounov exponent thus more than one 
amplifying direction in space. In volume-preserving dynamical systems to any 
amplifying direction there corresponds a shrinking direction (amplifying in the 
past). 

3. On compact manifolds, the two limits in Definition 2.1.2 do not commute: the
numerator is limited by compactness, whence the $1/n$ limit vanishes if performed
before letting $d(x, y) \to 0$.

4. If there is an intrinsic smallest distance $\delta > 0$ between points $x, y \in \mathcal{X}$ and the
largest possible distance $\Delta$ is finite, then the Lyapounov exponent is zero. This
means that exponential separation or amplification cannot be extended beyond
the logarithmic time-scale set by $\delta\, e^{\lambda n} \le \Delta$. This gives a so-called breaking-time [141]

\[ t_B := \frac{1}{\lambda} \log \frac{\Delta}{\delta} . \]

5. When the motion develops on a compact phase-space, the existence of positive
Lyapounov exponents is known as extreme sensitivity to initial conditions
and provides a widely accepted definition of classical chaotic motion [271,321].
Notice that without the folding condition, also an inverted harmonic oscillator with
Hamiltonian $H(r) = p^2/(2m) - m\omega^2 q^2/2$ would show an exponentially fast
separation of initial conditions, though far less irregular and interesting than one on
a compact manifold.


2.1.2 Shift Dynamical Systems 


Phase-spaces with a finite or a countable number of states are typical either of systems 
which arise from suitable discretizations of otherwise continuous phase-spaces or of 
intrinsically discrete systems as cellular automata [78]. The first possibility arises in 
particular when the observations aimed at identifying the system state as a point of 
phase-space have a finite accuracy; then, one performs a coarse-graining of phase 
space into a certain number of regions whose volume is determined by the given 
accuracy and whose interior points are accessible only through observations of higher 
accuracy. As we shall see in later sections, in such a case, the system states are 
identifiable with the labels of the regions where the system state is localized and the 
dynamics corresponds to jumping from label to label rather than from point to point 
of the phase-space. 

Instead, the phase-space of cellular automata [19,78] is discrete from the start as
they consist of copies of a same system (automaton) described by a $d$-valued function
$i$; for instance, in the binary case $i = 0$ may be used to signal when an automaton is
deactivated, $i = 1$ when it is activated.

The phase-space $\mathcal{X}$ of a cellular automaton with $N$ systems comprises $d^N$
configurations (states) corresponding to finite strings $i^{(N)} = (i_1, i_2, \ldots, i_N) \in \Omega_d^{(N)} =
\{1, 2, \ldots, d\}^N$. The dynamics is given in discrete time by a map $T : \Omega_d^{(N)} \to \Omega_d^{(N)}$
that updates the configurations from time $n$ to time $n + 1$: $i^{(N)}(n) \mapsto i^{(N)}(n + 1)$.
The state $i_k(n + 1)$ of the $k$-th automaton at time $n + 1$ in general depends on the
states of some or all other automata at time $n$. In the following, we shall focus
upon a most simple class of cellular automata, that is shift dynamical systems
[18,77,197,370].

Let the space $\mathcal{X}$ be the collection $\Omega_d := \{1, 2, \ldots, d\}^{\mathbb{N}}$ of all sequences $i =
\{i_j\}_{j\in\mathbb{N}}$ of symbols from a finite alphabet, $i_j = 1, 2, \ldots, d$. Each $i$ can be interpreted
as a configuration of a countable network of cellular automata, each of them being
indexed by an integer $j \in \mathbb{N}$, with $i_j$ denoting its actual state among the $d$ possible
ones. Let $T_\sigma : \Omega_d \to \Omega_d$ be the left shift along sequences,

\[ (T_\sigma i)_j = i_{j+1} , \tag{2.23} \]

and set $i(n) := T_\sigma^n i$: $T_\sigma$ amounts to a rather trivial dynamics, namely to a
deterministic updating whereby the state $i_j(n + 1)$ of the $j$-th automaton at time $n + 1$
depends only on (is equal to) that of its right nearest neighbor at time $n$:

\[ i_j(n + 1) := (T_\sigma^{n+1} i)_j = (T_\sigma\, i(n))_j = i_{j+1}(n) . \]

From the point of view of a fixed automaton, say the 0-th one, this kind of dynamics
is typically like tossing a coin. Indeed, suppose the initial configuration $i(0)$ of the
network is to be chosen randomly, according to a probability distribution where all
automaton states occur with the same probability $2^{-N}$. Because of the dynamics,
this property is then inherited by the sequence $\{i_0(n)\}_{n\in\mathbb{N}}$ of successive states of the
0-th automaton.

In order to provide the shift along binary sequences with a measure-theoretic
formulation as in Definition 2.1.1, the set $\Omega_d$ of infinite sequences has to be equipped
with a $\sigma$-algebra of measurable sets. The standard way to do this is by means of the
so-called cylinders [18,77,111,197]; they consist of all sequences whose entries have
fixed values within chosen intervals:

\[ C^{[j,k]}_{i^{(k-j+1)}} := \big\{ i' \in \Omega_2 : i'_{j+\ell} = i_{j+\ell} , \; \ell = 0, 1, \ldots, k - j \big\} , \qquad i^{(k-j+1)} = i_j i_{j+1} \cdots i_k . \tag{2.24} \]

They are labeled by the interval $[j, k]$ and by the binary string $i^{(k-j+1)}$ of length
$k - j + 1$ of assigned digits within that interval; each one of them can be obtained
as a finite intersection of simple cylinders $C^{\{\ell\}}_{i_\ell}$,

\[ C^{[j,k]}_{i^{(k-j+1)}} = \bigcap_{\ell=0}^{k-j} C^{\{j+\ell\}}_{i_{j+\ell}} , \qquad C^{\{\ell\}}_{i_\ell} = \big\{ i' \in \Omega_2 : i'_\ell = i_\ell \big\} . \tag{2.25} \]

We shall denote by $\mathcal{C}_{[j,k]}$ the sets consisting of the $2^{k-j+1}$ cylinders $C^{[j,k]}_{i^{(k-j+1)}}$.
The $\sigma$-algebra $\Sigma$ is obtained from all possible unions and intersections of simple
cylinders. Further, pre-images of cylinders under $T_\sigma$ remain cylinders: in fact

\[ T_\sigma^{-1}\big(C^{\{\ell\}}_{i_\ell}\big) = \big\{ i' \in \Omega_2 : T_\sigma i' \in C^{\{\ell\}}_{i_\ell} \big\} = \big\{ i' \in \Omega_2 : i'_\ell(1) = i'_{\ell+1} = i_\ell \big\} = C^{\{\ell+1\}}_{i_\ell} , \tag{2.26} \]

whence $T_\sigma$ is measurable with respect to $\Sigma$.
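
The relation between the left shift and simple cylinders can be made concrete by modelling a sequence as a function $j \mapsto i_j$. The Python sketch below is a hedged illustration under that modelling assumption; the sample binary sequence and the helper names are hypothetical and serve only to exercise the definitions.

```python
def shift(seq):
    """Left shift (2.23): (T i)_j = i_{j+1}; a sequence is a function j -> symbol."""
    return lambda j: seq(j + 1)

def in_cylinder(seq, j, string):
    """True if seq has the entries of `string` at positions j, j+1, ..., j+len-1."""
    return all(seq(j + l) == s for l, s in enumerate(string))

def in_simple_cylinder(seq, pos, symbol):
    """Membership in a simple cylinder: only the entry at `pos` is prescribed."""
    return seq(pos) == symbol

def sample(j):
    """An arbitrary binary sequence used only for testing (assumption)."""
    return (j * j + j // 3) % 2

# A cylinder is the intersection of simple cylinders, as in (2.25).
window = tuple(sample(j) for j in range(2, 6))
as_intersection = all(
    in_simple_cylinder(sample, 2 + l, s) for l, s in enumerate(window)
)

# The shifted sequence lies in the simple cylinder at position l exactly when the
# original one lies in the simple cylinder at position l + 1 with the same symbol.
preimage_rule = all(
    in_simple_cylinder(shift(sample), l, a) == in_simple_cylinder(sample, l + 1, a)
    for l in range(10) for a in (0, 1)
)
```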


Remarks 2.1.4 The left shift on unilateral sequences is not invertible; it becomes so
by choosing instead of $\Omega_d$ the set $\Omega_d^{\mathbb{Z}}$ of all doubly infinite sequences $i = \{i_j\}_{j\in\mathbb{Z}}$.
Then, the same result as in (2.26) holds for the images of cylinders under $T_\sigma$,
$T_\sigma\big(C^{\{\ell\}}_{i_\ell}\big) = C^{\{\ell-1\}}_{i_\ell}$, whence $T_\sigma^{-1}$ is also measurable.


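We shall refer to any probability measure $\mu$ on $\Sigma$ as to a global state on $\Omega_d$; to any
such $\mu$ there correspond local states $\mu_{[j,k]}$ on the cylinder sets $\mathcal{C}_{[j,k]}$. As cylinders in
$\mathcal{C}_{[j,k]}$ are in one-to-one correspondence with strings $i^{(k-j+1)} \in \Omega_d^{(k-j+1)}$ of length
$k - j + 1$, these local states are probability distributions on $\Omega_d^{(k-j+1)}$,

\[ \mu_{[j,k]} = \big\{ p_{[j,k]}(i^{(k-j+1)}) \big\}_{i^{(k-j+1)} \in \Omega_d^{(k-j+1)}} , \qquad p_{[j,k]}(i^{(k-j+1)}) \ge 0 , \quad \sum_{i^{(k-j+1)} \in \Omega_d^{(k-j+1)}} p_{[j,k]}(i^{(k-j+1)}) = 1 . \tag{2.27} \]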
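Consider the sequence of local states $\{\mu^{(n)}\}_{n\in\mathbb{N}}$,

\[ \mu^{(n)} = \big\{ p^{(n)}(i^{(n)}) \big\}_{i^{(n)} \in \Omega_d^{(n)}} , \qquad p^{(n)}(i^{(n)}) := p_{[1,n]}(i^{(n)}) , \tag{2.28} \]

on the cylinder sets $\mathcal{C}_{[1,n]}$; since $C^{[1,n]}_{i_1 i_2 \cdots i_n} = \bigcup_{i=1}^d C^{[1,n+1]}_{i_1 i_2 \cdots i_n i}$, from the additivity of the
measure the following compatibility condition follows

\[ p^{(n)}(i_1 i_2 \cdots i_n) = \mu\big(C^{[1,n]}_{i_1 i_2 \cdots i_n}\big) = \sum_{i=1}^d p^{(n+1)}(i_1 i_2 \cdots i_n i) . \tag{2.29} \]

Particularly important global states over $\Omega_d$ correspond to shift-invariant probability
measures $\mu$, $\mu \circ T_\sigma^{-1} = \mu$. From

\[ \mu\big(C^{[2,n+1]}_{i^{(n)}}\big) = \mu\big(T_\sigma^{-1}(C^{[1,n]}_{i^{(n)}})\big) = \mu\big(C^{[1,n]}_{i^{(n)}}\big) \]

it then follows that

\[ \sum_{i=1}^d p^{(n)}(i\, i_2 \cdots i_n) = \sum_{i=1}^d \mu\big(C^{[1,n]}_{i\, i_2 \cdots i_n}\big) = \mu\big(T_\sigma^{-1}(C^{[1,n-1]}_{i_2 \cdots i_n})\big) = \mu\big(C^{[1,n-1]}_{i_2 \cdots i_n}\big) = p^{(n-1)}(i_2 \cdots i_n) . \tag{2.30} \]

As a consequence, if $\mu$ is shift-invariant the probabilities assigned to cylinders
$C^{[j,k]}_{i_j i_{j+1} \cdots i_k}$ depend only on the values $i_j i_{j+1} \cdots i_k$ defining the cylinder, but not on
the interval $[j, k]$.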


Remarks 2.1.5 Interestingly, the conditions (2.29) and (2.30) define a dynamical
triplet $(\Omega_d, T_\sigma, \mu)$ in the sense of Definition 2.1.1. This is the content of Kolmogorov
representation theorem [313]: if $\mathcal{X} = \{1, 2, \ldots, d\}$, the set $\Omega_d$, as the infinite Cartesian
product $\mathcal{X}^{\times\infty}$ of countably many copies of $\mathcal{X}$, can be equipped with the product
topology, which is the coarsest one with respect to which the projection maps
$\pi_j : i \mapsto i_j$ are continuous, namely the one generated by unions and intersections of
preimages $\pi_j^{-1}(B)$ of sets $B \subseteq \mathcal{X}$ that are open with respect to the discrete topology
of $\mathcal{X}$. Then, $\Omega_d$ is a compact set by Tychonoff theorem [296]. Namely, any open
cover of $\Omega_d$ also contains a finite subcover, whence in any collection of closed sets
in $\Omega_d$ with empty intersection there also exists a finite sub-collection with empty
intersection [296].

Suppose one is given a collection of numbers $p^{(n)}(i^{(n)})$ as in (2.28) satisfying
(2.27); they assign volumes $\mu\big(C^{[1,n]}_{i^{(n)}}\big) = p^{(n)}(i^{(n)})$, and define local states on
the measure algebras generated by these cylinders. If the quantities $p^{(n)}(i^{(n)})$ fulfil
(2.29), the local states extend to a positive, finite and additive function $\mu$ on the
algebra generated by cylinders. In order to show that $\mu$ is also $\sigma$-additive and thus
a measure, one uses Remark 2.1.1.4 and that each set in the algebra generated by
cylinders is closed in the product topology. Therefore, given any decreasing sequence
$\{C_n\}_n$ of such sets with empty intersection, compactness ensures that there exists a finite
sub-collection $\{C_{n_j}\}_{j=1}^k$ such that $\bigcap_{j=1}^k C_{n_j} = \emptyset$, whence $\lim_{n\to\infty} \mu(C_n) = 0$.

Further, suppose that the quantities $p^{(n)}(i^{(n)})$ also fulfil (2.30); then it turns out
that

\[ \mu\big(C^{[j,k]}_{i_j \cdots i_k}\big) = \sum_{i_1 i_2 \cdots i_{j-1}} p^{(k)}(i_1 i_2 \cdots i_{j-1} i_j \cdots i_k) = \sum_{i_\ell \cdots i_{j-1}} p^{(k-\ell+1)}(i_\ell \cdots i_k) \]

for all $\ell = 1, 2, \ldots, j - 1$. Therefore, $\mu\big(C^{[j,k]}_{i_j \cdots i_k}\big) = p^{(k-j+1)}(i_j \cdots i_k)$, whence
$\mu\big(T_\sigma^{-1}(C^{[j,k]}_{i_j \cdots i_k})\big) = \mu\big(C^{[j,k]}_{i_j \cdots i_k}\big)$ and the measure $\mu$ is shift-invariant.
Example 2.1.4 (Bernoulli shifts) Consider a shift dynamical system $(\Omega_d, T_\sigma, \mu)$;
the simplest choice of local states $\mu^{(n)}$ corresponds to product measures on $\Omega_d^{(n)}$:

\[ p^{(n)}(i^{(n)}) = \prod_{j=1}^n p(i_j) , \qquad p(i) \ge 0 , \quad \sum_{i=1}^d p(i) = 1 . \tag{2.31} \]

These dynamical triplets are known as Bernoulli shifts; if $d = 2$ and $\mathcal{X} = \{0, 1\}$,
$(\Omega_2, T_\sigma, \mu)$ amounts to repeatedly tossing a coin, possibly biased if the probabilities
of head (0) and tail (1) are different.


Example 2.1.5 (Markov Chains) Shift dynamical systems slightly more correlated
than Bernoulli shifts are the so-called Markov shifts. Given the local states $\mu^{(n)} =
\{p^{(n)}(i^{(n)})\}_{i^{(n)} \in \Omega_d^{(n)}}$, the ratios

\[ p(i_n | i_1 i_2 \cdots i_{n-1}) := \frac{p^{(n)}(i_1 i_2 \cdots i_n)}{p^{(n-1)}(i_1 i_2 \cdots i_{n-1})} \tag{2.32} \]

define conditional probabilities for the $n$-th symbol to be $i_n$ if the previous $n - 1$
ones are $i_1 \cdots i_{n-1}$. The global state $\mu$ is said to possess the Markov property if and
only if the following conditions occur:

\[ p(i_n | i_1 i_2 \cdots i_{n-1}) = p(i_n | i_{n-1}) \tag{2.33} \]

\[ \sum_{i=1}^d p(i | j) = 1 \tag{2.34} \]

\[ \sum_{j=1}^d p(i | j)\, p(j) = p(i) . \tag{2.35} \]


Condition (2.33) means that the conditional probabilities (2.32) depend only on $i_n$
and on $i_{n-1}$ and not on the previous symbols, so that

\[ p^{(n)}(i_1 i_2 \cdots i_n) = \Big( \prod_{\ell=1}^{n-1} p(i_{\ell+1} | i_\ell) \Big)\, p(i_1) . \tag{2.36} \]

Therefore, the local states $\mu^{(n)}$ are completely specified by the $d \times d$ matrix $P =
[p(i_n | i_{n-1})]$ and the probability vector $|p\rangle = \{p(i)\}_{i=1}^d$.

Because of (2.34), the matrix $P$ is a stochastic matrix, namely its entries $p(i|j)$
are positive and qualify as transition probabilities as they express the fact that the
system cannot but remain in the same state or change into another one. It follows
that condition (2.29) is satisfied; indeed, from (2.36),

\[ \sum_{i=1}^d p^{(n)}(i_1 i_2 \cdots i_{n-1} i) = \sum_{i=1}^d p(i | i_{n-1}) \Big( \prod_{\ell=1}^{n-2} p(i_{\ell+1} | i_\ell) \Big)\, p(i_1) = \Big( \prod_{\ell=1}^{n-2} p(i_{\ell+1} | i_\ell) \Big)\, p(i_1) = p^{(n-1)}(i_1 i_2 \cdots i_{n-1}) . \]

Further, because of (2.35), the probability vector is an eigenvector with eigenvalue
1 of the matrix $P$ and (2.30) is also satisfied, whence the local states $\mu^{(n)}$ generate a
global shift-invariant state on $\Omega_d$. In fact,

\[ \sum_{i=1}^d p^{(n)}(i\, i_2 \cdots i_n) = \Big( \prod_{\ell=2}^{n-1} p(i_{\ell+1} | i_\ell) \Big) \sum_{i=1}^d p(i_2 | i)\, p(i) = \Big( \prod_{\ell=2}^{n-1} p(i_{\ell+1} | i_\ell) \Big)\, p(i_2) = p^{(n-1)}(i_2 i_3 \cdots i_n) . \]

Notice that Bernoulli shifts are particular instances of Markov chains with transition
probabilities $p(i|j) = p(i)$ for all $j = 1, 2, \ldots, d$.
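
The two verifications above lend themselves to a direct numerical check. The Python sketch below is a minimal illustration with exact fractions: the $2 \times 2$ transition matrix and the probability vectors are arbitrary examples, not taken from the text, symbols are labelled $0, 1$ instead of $1, 2$, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def markov_prob(string, P, p):
    """p^(n)(i_1...i_n) = p(i_1) * prod_l p(i_{l+1} | i_l), as in (2.36)."""
    prob = p[string[0]]
    for prev, nxt in zip(string, string[1:]):
        prob *= P[nxt][prev]          # P[i][j] stands for p(i | j)
    return prob

def compatible(P, p, n):
    """Condition (2.29): summing over the last symbol gives p^(n-1)."""
    d = len(p)
    return all(
        sum(markov_prob(s + (i,), P, p) for i in range(d)) == markov_prob(s, P, p)
        for s in product(range(d), repeat=n - 1)
    )

def shift_invariant(P, p, n):
    """Condition (2.30): summing over the first symbol gives p^(n-1)."""
    d = len(p)
    return all(
        sum(markov_prob((i,) + s, P, p) for i in range(d)) == markov_prob(s, P, p)
        for s in product(range(d), repeat=n - 1)
    )

# Arbitrary stochastic matrix: each column sums to one, condition (2.34).
P = [[Fraction(9, 10), Fraction(1, 2)],
     [Fraction(1, 10), Fraction(1, 2)]]
stationary = [Fraction(5, 6), Fraction(1, 6)]     # fulfils (2.35) for this P
uniform = [Fraction(1, 2), Fraction(1, 2)]        # does not fulfil (2.35)

# Bernoulli shift as a Markov chain with p(i | j) = p(i).
bernoulli_p = [Fraction(1, 3), Fraction(2, 3)]
bernoulli_P = [[bernoulli_p[i] for _ in range(2)] for i in range(2)]
```

With the stationary vector both conditions hold, while the uniform vector still satisfies (2.29) but violates (2.30), which shows that (2.35) is what enforces shift-invariance.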
2.2 Symbolic Dynamics


As already remarked, states corresponding to a continuous phase-space can only be
identified with finite precision, that is, they can be located within subsets of small but
finite size, and cannot be further resolved. A typical case is when the finite accuracy
available corresponds to the subdivision of the phase-space in a finite number of
non-overlapping measurable subsets, namely to a coarse-graining of the phase-space $\mathcal{X}$
by means of a so-called finite partition [8,200].


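Definition 2.2.1 (Partitions)

1. A finite, measurable partition (partition for short) $\mathcal{P}$ of $(\mathcal{X}, T, \mu)$ is any collection
of measurable subsets $P_i \subseteq \mathcal{X}$, $i \in I_{\mathcal{P}}$, $I_{\mathcal{P}}$ an index set of finite cardinality, such
that $P_i \cap P_j = \emptyset$ for $i \neq j$ and $\bigcup_{i\in I_{\mathcal{P}}} P_i = \mathcal{X}$. The subsets $P_i$ are called atoms.

2. A partition $\mathcal{P} = \{P_i\}_{i\in I_{\mathcal{P}}}$ is finer than a partition $\mathcal{Q} = \{Q_j\}_{j\in I_{\mathcal{Q}}}$ ($\mathcal{Q}$ coarser than
$\mathcal{P}$), symbolically $\mathcal{Q} \preceq \mathcal{P}$, if the atoms of $\mathcal{Q}$ are unions of atoms of $\mathcal{P}$: $Q_j =
\bigcup_{i\in I_j \subseteq I_{\mathcal{P}}} P_i$, for all $j \in I_{\mathcal{Q}}$.

3. Given two partitions $\mathcal{P} = \{P_i\}_{i\in I_{\mathcal{P}}}$ and $\mathcal{Q} = \{Q_j\}_{j\in I_{\mathcal{Q}}}$, the partition $\mathcal{P} \vee \mathcal{Q} =
\{P_i \cap Q_j\}_{i\in I_{\mathcal{P}}, j\in I_{\mathcal{Q}}}$ is the coarsest refinement of $\mathcal{P}$ and $\mathcal{Q}$.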


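Example 2.2.1 ([77]) Quite often, $\mathcal{X}$ is endowed with a $\sigma$-algebra $\Sigma$ which is
generated by a measure-algebra $\Sigma_0$; it is then possible to approximate within $\varepsilon$
any finite $\Sigma$-measurable partition $\mathcal{P} = \{P_i\}_{i=1}^d$ by a finite $\Sigma_0$-measurable partition
$\mathcal{Q} = \{Q_i\}_{i=1}^d$ with atoms $Q_i \in \Sigma_0$, in the sense that (see (2.1)) $\mu(P_i\, \Delta\, Q_i) \le \varepsilon$, $i =
1, 2, \ldots, d$. Indeed, because of Remark 2.1.1.4, given $\delta > 0$, for any $P_i \in \mathcal{P}$ one can
find $Q'_i \in \Sigma_0$ such that $\mu(P_i\, \Delta\, Q'_i) \le \delta$; notice that $P_i \cap P_j = \emptyset$, thus $x \in Q'_i \cap Q'_j$
and $x \notin P_i$ yield $x \in Q'_i\, \Delta\, P_i$, whence

\[ Q'_i \cap Q'_j \subseteq (Q'_i\, \Delta\, P_i) \cup (Q'_j\, \Delta\, P_j) \;\Longrightarrow\; \mu(Q'_i \cap Q'_j) \le 2\delta . \]

The sets $Q'_i$ need not form a partition; however, let $Q' := \bigcup_{i\neq j} Q'_i \cap Q'_j$, which
is such that $\mu(Q') \le d(d - 1)\, \delta$, and set

\[ Q_i := Q'_i \setminus Q' , \quad i = 1, 2, \ldots, d - 1 , \qquad Q_d := \mathcal{X} \setminus \bigcup_{j=1}^{d-1} Q_j . \]

These are atoms of a partition $\mathcal{Q} \subseteq \Sigma_0$. Consider first the symmetric differences
$Q_i\, \Delta\, P_i$, $i = 1, 2, \ldots, d - 1$; one has that, if $x \in Q_i$ and $x \notin P_i$, then $x \in Q'_i\, \Delta\, P_i$,
while, if $x \in P_i$ and $x \notin Q_i$, then $x \in Q'_i\, \Delta\, P_i$ or $x \in P_i \cap Q'$, whence

\[ Q_i\, \Delta\, P_i \subseteq Q' \cup (Q'_i\, \Delta\, P_i) \;\Longrightarrow\; \mu(Q_i\, \Delta\, P_i) \le (d(d - 1) + 1)\, \delta . \]

Since $P_d = \mathcal{X} \setminus \bigcup_{i=1}^{d-1} P_i$ and $(\mathcal{X} \setminus A)\, \Delta\, (\mathcal{X} \setminus B) = A\, \Delta\, B$,

\[ Q_d\, \Delta\, P_d = \Big( \bigcup_{i=1}^{d-1} Q_i \Big)\, \Delta\, \Big( \bigcup_{i=1}^{d-1} P_i \Big) \subseteq \bigcup_{i=1}^{d-1} (Q_i\, \Delta\, P_i) \]

yields $\mu(Q_d\, \Delta\, P_d) \le (d - 1)(d(d - 1) + 1)\, \delta$, whence the result follows by choosing
$\delta = (d - 1)^{-1} (d(d - 1) + 1)^{-1}\, \varepsilon$.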


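The volumes $\mu(P_i) =: p(i)$ of the atoms of any partition $\mathcal{P}$ provide a discrete
probability measure $\mu_{\mathcal{P}} := \{\mu(P_i)\}_{i\in I_{\mathcal{P}}}$ on $\mathcal{P}$. While atoms in general change under
the dynamics $T$,

\[ P_i \mapsto T^{-j}(P_i) := \{x \in \mathcal{X} : T^j x \in P_i\} \qquad \forall\, j \ge 0 , \tag{2.37} \]

their volumes do not, for $T$ is assumed to preserve $\mu$.
Further, if $P_{i_1} \cap P_{i_2} = \emptyset$, then $T^{-j}(P_{i_1}) \cap T^{-j}(P_{i_2}) = \emptyset$. Therefore, for all $j \in \mathbb{N}$
($j \in \mathbb{Z}$ if $T$ has a measurable inverse)

\[ \mathcal{P}^j := T^{-j}(\mathcal{P}) = \{T^{-j}(P_i)\}_{i\in I_{\mathcal{P}}} \tag{2.38} \]

are partitions with the same probability distribution of $\mathcal{P}$: $\mu_{\mathcal{P}^j} = \mu_{\mathcal{P}}$. Further,
partitions at successive times are all refined by the partition

\[ \mathcal{P}^{(n)} := \bigvee_{j=0}^{n-1} \mathcal{P}^j = \mathcal{P} \vee T^{-1}(\mathcal{P}) \vee \cdots \vee T^{1-n}(\mathcal{P}) . \tag{2.39} \]

If $p := \mathrm{card}(I_{\mathcal{P}})$, the atoms

\[ P_{i^{(n)}} := P_{i_0} \cap T^{-1}(P_{i_1}) \cap \cdots \cap T^{-n+1}(P_{i_{n-1}}) \tag{2.40} \]

of $\mathcal{P}^{(n)}$ are labeled by strings $i^{(n)} = i_0 i_1 \cdots i_{n-1} \in \Omega_p^{(n)}$. We shall denote by

\[ \mu_{\mathcal{P}}^{(n)} = \big\{ p^{(n)}(i^{(n)}) = \mu(P_{i^{(n)}}) \big\}_{i^{(n)} \in \Omega_p^{(n)}} \tag{2.41} \]

the probability distribution associated with $\mathcal{P}^{(n)}$ and consisting of the volumes of its
atoms with respect to the given probability measure $\mu$.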


Remarks 2.2.1 Notice that a phase-point $x \in \mathcal{X}$ belongs to the atom $P_{i^{(n)}}$ of $\mathcal{P}^{(n)}$
if and only if $T^j x \in P_{i_j}$ for all $0 \le j \le n - 1$. As a consequence, the atoms of $\mathcal{P}^{(n)}$
contain all phase-points $x \in \mathcal{X}$ whose trajectories $\{T^j x\}_{j\in\mathbb{Z}}$ successively intercept
the atoms $P_{i_j}$ of $\mathcal{P}$ identified by the string $i^{(n)} = i_0 i_1 \cdots i_{n-1} \in \Omega_p^{(n)}$. As an effect of
the coarse-graining, segments of different trajectories $\{T^j x\}_{j=0}^{n-1}$ may correspond to
a same $i^{(n)} \in \Omega_p^{(n)}$: thus, as normalized volumes, the probabilities $p^{(n)}(i^{(n)})$ quantify
how likely it is that different initial conditions give rise to a same segment of trajectory
between (discrete) time $j = 0$ and $j = n - 1$.
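
The construction of the string probabilities $p^{(n)}(i^{(n)})$ can be mimicked on a toy system. The Python sketch below is an assumption-laden illustration: the phase-space is the finite set $\{0, 1, \ldots, 11\}$ with the uniform measure, the invertible map $x \mapsto 5x + 1 \bmod 12$ plays the role of $T$, and the two-atom partition is arbitrary; none of these choices come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N = 12                                   # size of the toy phase-space (assumption)

def T(x):
    """A bijection of {0, ..., N-1}; it preserves the uniform measure."""
    return (5 * x + 1) % N

def label(x):
    """Atom label of x for the partition P_0 = {0,...,4}, P_1 = {5,...,11}."""
    return 0 if x < 5 else 1

def itinerary(x, n):
    """String i_0 i_1 ... i_{n-1} with T^j x in the atom labelled i_j, cf. (2.40)."""
    out = []
    for _ in range(n):
        out.append(label(x))
        x = T(x)
    return tuple(out)

def string_probs(n):
    """Volumes of the atoms of the refined partition, cf. (2.41)."""
    counts = Counter(itinerary(x, n) for x in range(N))
    return {s: Fraction(c, N) for s, c in counts.items()}

p2, p3 = string_probs(2), string_probs(3)

# Conditions (2.29) and (2.30) for the resulting local states.
cond_229 = all(
    sum(p3.get(s + (i,), 0) for i in (0, 1)) == p2.get(s, 0)
    for s in product((0, 1), repeat=2)
)
cond_230 = all(
    sum(p3.get((i,) + s, 0) for i in (0, 1)) == p2.get(s, 0)
    for s in product((0, 1), repeat=2)
)
```

Both conditions hold here because $T$ is invertible and measure preserving, which is the mechanism invoked for general dynamical triplets.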


Lemma 2.2.1 Given a reversible dynamical system $(\mathcal{X}, T, \mu)$ and a partition $\mathcal{P} =
\{P_i\}_{i\in I_{\mathcal{P}}}$, $\mathrm{card}(I_{\mathcal{P}}) = p$, the dynamics $T : \mathcal{X} \to \mathcal{X}$ corresponds to the left-shift
(2.23) on sequences $i \in \Omega_p^{\mathbb{Z}}$ (see Remark 2.1.4).


Proof Let $i(x) \in \Omega_p^{\mathbb{Z}}$ be the sequence of atom labels corresponding to the trajectory
$\{T^j(x)\}_{j\in\mathbb{Z}}$ with initial point $x \in \mathcal{X}$. According to Sect. 2.1.2 and to (2.37), $i_j(x) =
i_j$ if and only if $T^j x \in P_{i_j}$. Then, from (2.23),

\[ i_j(Tx) = i_j \iff T^{j+1} x \in P_{i_j} \iff i_{j+1}(x) = (T_\sigma i)_j(x) = i_j . \]

Therefore, any coarse-graining of $\mathcal{X}$ by means of a partition $\mathcal{P}$ provides a description
of the dynamical triplet $(\mathcal{X}, T, \mu)$ in terms of the left shift on the sequences in
$\Omega_p^{\mathbb{Z}}$. The segments of trajectories up to time $n - 1$ are in one-to-one correspondence
with the sets of strings $i^{(n)} \in \Omega_p^{(n)}$ and the probability distribution $\mu_{\mathcal{P}}^{(n)}$ provides states
over the cylinder sets $\mathcal{C}_{[0,n-1]}$. By means of (2.40) and of the $T$-invariance of $\mu$, one
shows that conditions (2.29) and (2.30) are fulfilled, whence the local states $\mu_{\mathcal{P}}^{(n)}$
define a global shift-invariant state $\mu_{\mathcal{P}}$ over $\Omega_p^{\mathbb{Z}}$. By varying $x \in \mathcal{X}$, the trajectories
$\{T^j x\}_{j\in\mathbb{Z}}$ get in general encoded by a subset $\widetilde{\Omega}_p \subseteq \Omega_p^{\mathbb{Z}}$.


Definition 2.2.2 (Symbolic Models) Given a partition $\mathcal{P}$ of $\mathcal{X}$, the triplet $(\widetilde{\Omega}_p, T_\sigma,
\mu_{\mathcal{P}})$ provides a symbolic model for the dynamical system $(\mathcal{X}, T, \mu)$.

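Example 2.2.2 (Baker Map) The Baker map (see Fig. 2.1) is the invertible map of
the two-dimensional torus $\mathbb{T}^2 = \{x = (x_1, x_2) \bmod 1\}$ into itself given by

\[ T_B\, x = \begin{cases} \left( 2x_1 , \dfrac{x_2}{2} \right) & 0 \le x_1 < \dfrac{1}{2} \\[2mm] \left( 2x_1 - 1 , \dfrac{1 + x_2}{2} \right) & \dfrac{1}{2} \le x_1 < 1 \end{cases} , \qquad T_B^{-1}\, x = \begin{cases} \left( \dfrac{x_1}{2} , 2x_2 \right) & 0 \le x_2 < \dfrac{1}{2} \\[2mm] \left( \dfrac{1 + x_1}{2} , 2x_2 - 1 \right) & \dfrac{1}{2} \le x_2 < 1 \end{cases} . \]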


The map $T_B$ is measurable with respect to the Borel $\sigma$-algebra of $\mathbb{T}^2$ and preserves
the Lebesgue measure $d\mu(x) = dx_1 dx_2$: altogether, one has a dynamical system
described by the measure-theoretic triplet $(\mathbb{T}^2, T_B, dx)$. It is evident that, when
$n \to +\infty$, a sufficiently small distance between two points $x$ and $x + \delta$ increases as
$2^n$ along the horizontal direction until it gets of order 1. Therefore, Definition 2.1.2
gives $\log 2 > 0$ as maximal Lyapounov exponent of the Baker map; instead, small
distances decrease exponentially along the vertical direction with the same speed so
that volumes are conserved.

Fig. 2.1 Baker map

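Let $\omega_+(x_1) = \{\omega_i\}_{i\ge 0}$ and $\omega_-(x_2) = \{\omega_{-j}\}_{j\ge 1}$ be the half-sequences consisting
of the coefficients of the binary expansions of $x_1$, respectively $x_2$:

\[ x_1 = \sum_{j=0}^{\infty} \frac{\omega_j}{2^{j+1}} , \qquad x_2 = \sum_{j=1}^{\infty} \frac{\omega_{-j}}{2^j} . \]

Setting $\omega(x) := (\omega_-(x_2), \omega_+(x_1)) = \{\omega_j(x)\}_{j\in\mathbb{Z}} \in \Omega_2^{\mathbb{Z}}$ and using the mod 1 folding
condition defining $\mathbb{T}^2$, it turns out that $T_B$ is isomorphic to the left shift $T_\sigma$ on $\Omega_2^{\mathbb{Z}}$,
namely $\omega_j(T_B x) = \omega_{j+1}(x)$.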

Further, the Lebesgue measure on $\mathbb{T}^2$ corresponds to the uniform product measure
(2.31) on the $\sigma$-algebra generated by cylinders. This can be seen as follows.
According to (2.25), cylinders are intersections of simple cylinders as $C^{\{0\}}_0$
and $C^{\{0\}}_1$, which correspond to the vertical rectangles $P_0 = \{x : 0 \le x_1 < 1/2\}$ and
$P_1 = \{x : 1/2 \le x_1 < 1\}$, and their images $C^{\{j\}}_i = T_B^{-j}\big(C^{\{0\}}_i\big)$, $j \in \mathbb{Z}$ (see (2.26)).
Under $T_B$ they get rotated into horizontal rectangles; successive applications of the
Baker map split them into horizontal rectangles of half height, each one of them
having as neighbors halved rectangles coming from the other initial rectangle.

It turns out that $C^{[-n+1,-1]}_{i_{-n+1} \cdots i_{-1}}$ is a horizontal rectangle of width 1 and height $2^{-n+1}$;
a further intersection with $C^{\{0\}}_{i_0}$ provides the cylinder $C^{[-n+1,0]}_{i_{-n+1} \cdots i_0}$, corresponding to a
horizontal rectangle of width $1/2$ and height $2^{-n+1}$ whose area is $2^{-n}$. These areas
may only come from a product measure,

\[ \mu\big(C^{[-n+1,0]}_{i_{-n+1} \cdots i_0}\big) = \prod_{j=-n+1}^{0} \mu\big(C^{\{j\}}_{i_j}\big) = 2^{-n} . \]

Therefore, the coarse-graining of $\mathbb{T}^2$ given by $\mathcal{P} = \{P_0, P_1\}$ provides the symbolic
model $(\Omega_2, T_\sigma, \mu_{\mathcal{P}})$ for $(\mathbb{T}^2, T_B, dx)$.
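
The isomorphism between the Baker map and the left shift on binary digits can be checked on sample points. The Python sketch below is a hedged illustration using exact rationals; the two test points and the digit window are arbitrary assumptions, and `digit` extracts the binary coefficients of $x_1$ for $j \ge 0$ and of $x_2$ for $j \le -1$, following the expansions given above.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)

def baker(x):
    """Baker map on the unit square with exact rational coordinates."""
    x1, x2 = x
    if x1 < HALF:
        return (2 * x1, x2 / 2)
    return (2 * x1 - 1, (1 + x2) / 2)

def baker_inverse(x):
    """Inverse Baker map."""
    x1, x2 = x
    if x2 < HALF:
        return (x1 / 2, 2 * x2)
    return ((1 + x1) / 2, 2 * x2 - 1)

def digit(x, j):
    """Binary digit omega_j(x): from x1 when j >= 0, from x2 when j <= -1."""
    x1, x2 = x
    if j >= 0:
        return floor(2 ** (j + 1) * x1) % 2
    return floor(2 ** (-j) * x2) % 2

points = [(Fraction(11, 32), Fraction(5, 16)), (Fraction(3, 4), Fraction(1, 3))]

# omega_j(T_B x) = omega_{j+1}(x) on a window of indices around j = 0.
acts_as_shift = all(
    digit(baker(x), j) == digit(x, j + 1)
    for x in points for j in range(-8, 8)
)
inverts = all(baker_inverse(baker(x)) == x for x in points)
```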


2.2.1 Algebraic Formulations 


In this section, instead of referring to phase-space trajectories, we shall consider 
classical dynamical systems from the point of view of their observables and of their 
time-evolution. By observables we mean suitable functions over the phase-space. 

It is convenient to consider complex-valued functions $f : \mathcal{X} \to \mathbb{C}$; their values
$f(x)$ can be inferred by measuring real, $\Re(f)$, and imaginary parts, $\Im(f)$. Further, it
is reasonable to assume that functions $f$, $g$ in a suitably chosen class of observables
give observables in the same class under addition, $(f, g) \mapsto (f + g)(x) = f(x) +
g(x)$, and multiplication either by scalars $\alpha \in \mathbb{C}$, $(\alpha, f) \mapsto (\alpha f)(x) = \alpha f(x)$, or
by another observable, $(f, g) \mapsto fg$, where $fg(x) = f(x)\, g(x)$.

In other words, it is a reasonable physical assumption to require that observables
constitute algebras of functions on $\mathcal{X}$: these algebras are commutative for $fg = gf$.
Physically speaking, there are no fundamental obstructions to the fact that classical
measuring processes can, in principle, be performed without effects on the
state of the measured system. As the measured values depend on the system state,
it follows that measuring $g$ and then $f$ yields the same results as measuring $f$ and
then $g$.

Also, it is practically convenient to approximate certain observables in the algebra 
by means of other observables that are in a certain sense close to them; we shall thus 
assume these commutative algebras of observables to be endowed with topologies 
and to be closed with respect to them; in particular, we shall consider algebras of 
observables where converging sequences of functions do converge to observables in 
the algebra. 

Like in Examples 2.1.2 and 2.1.3, in the following we shall assume $\mathcal{X}$ to be
compact in a metric topology and measurable with respect to the Borel $\sigma$-algebra
that contains all its open and closed sets. Then, a natural algebra of observables is
provided by the continuous functions on $\mathcal{X}$ [305].


Definition 2.2.3 Let $\mathcal{X}$ be a compact metric space; $C(\mathcal{X})$ will denote the Banach
*-algebra (with identity) of continuous functions $f : \mathcal{X} \to \mathbb{C}$ endowed with the
uniform topology given by the norm

\[ C(\mathcal{X}) \ni f \mapsto \|f\| = \sup\{ |f(x)| : x \in \mathcal{X} \} . \tag{2.42} \]
Remarks 2.2.2 


1. If $f, g \in C(\mathcal{X})$ and $\alpha \in \mathbb{C}$ then $f + \alpha g \in C(\mathcal{X})$ as well as $fg \in C(\mathcal{X})$. Sums,
multiplications by complex scalars, by continuous functions and complex
conjugation $* : f(x) \mapsto f^*(x)$ all map $C(\mathcal{X})$ into itself. These facts make $C(\mathcal{X})$ a
*-algebra with a norm $f \mapsto \|f\|$; indeed,

$$\|f\| = 0 \Rightarrow f = 0, \qquad \|\alpha f\| = |\alpha|\,\|f\|, \qquad \|f + g\| \le \|f\| + \|g\|,$$

for all $f, g \in C(\mathcal{X})$, $\alpha \in \mathbb{C}$. This norm defines the uniform neighborhoods

$$U_\epsilon(f) := \{\, g \in C(\mathcal{X}) : \|f - g\| \le \epsilon \,\}, \qquad f \in C(\mathcal{X}), \tag{2.43}$$

and equips $C(\mathcal{X})$ with a metric and a corresponding topology called uniform
topology, $\tau_u$.

2. A sequence $\{f_n\}_{n\in\mathbb{N}} \subset C(\mathcal{X})$ is a Cauchy sequence if, for any $\epsilon > 0$ there exists
$N \in \mathbb{N}$ such that $n, m > N \Rightarrow \|f_n - f_m\| \le \epsilon$; since all Cauchy sequences in
$C(\mathcal{X})$ converge uniformly to some $f \in C(\mathcal{X})$, that is $\lim_n \|f - f_n\| = 0$ or $\lim_n f_n = f$,
$C(\mathcal{X})$ is termed a Banach algebra.

Also, $\|f^* f\| = \|f\|^2$, $\|f^*\| = \|f\|$ and $\|fg\| \le \|f\|\,\|g\|$ for all $f, g \in C(\mathcal{X})$;
this makes $C(\mathcal{X})$ a C* algebra (see Definition 5.2.1).

3. Because of the assumed compactness of $\mathcal{X}$, the identity function $\mathbb{1}(x) = 1$ belongs
to $C(\mathcal{X})$. When $\mathcal{X}$ is not compact, one considers the *-algebra $C_0(\mathcal{X})$ consisting
of the complex continuous functions on $\mathcal{X}$ vanishing at infinity. When equipped
with the norm (2.42), $C_0(\mathcal{X})$ is a Banach algebra, but the identity function does
not belong to it.


A description of dynamical systems by means of continuous functions is, however,
too restrictive, in general. For instance, the corresponding C* algebras cannot contain
observables related to yes/no questions like


is the state localized within a measurable subset (region) $A \subseteq \mathcal{X}$ or not?


as these correspond to characteristic functions $\mathbb{1}_A$ of $A$ which are only measurable and
not continuous. The Koopman-von Neumann formulation of Example 2.1.1 offers
a natural way to enlarge the algebra of observables. In a quantum-like notation, we
shall denote by $|\psi\rangle$ any function in $L^2_\mu(\mathcal{X})$ and by $\langle x|\psi\rangle$ its value $\psi(x)$ at $x \in \mathcal{X}$.
Functions $f \in C(\mathcal{X})$ can then be represented on $L^2_\mu(\mathcal{X})$ as multiplication operators
$M_f$:

$$\langle x | M_f | \psi \rangle = f(x)\,\psi(x), \qquad \psi \in L^2_\mu(\mathcal{X}). \tag{2.44}$$

In the following, we shall identify $C(\mathcal{X})$ and its representation by multiplication
operators, that is we shall identify $M_f$ and $f$.

Remarks 2.2.3


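1. The maps $C(\mathcal{X}) \ni f \mapsto L_\psi(f) := \|\, f|\psi\rangle \|$, with $\psi \in L^2_\mu(\mathcal{X})$, are semi-norms. They
define strong-neighborhoods on $C(\mathcal{X})$, that is neighborhoods in the so-called strong
topology, $\tau_s$,

$$U^{\epsilon}_{\{\psi_j\}_{j=1}^n}(f) = \bigl\{\, g \in C(\mathcal{X}) : L_{\psi_j}(f - g) \le \epsilon,\ 1 \le j \le n \,\bigr\}. \tag{2.45}$$

Since $\|(f - g)|\psi\rangle\| \le \|f - g\|\,\|\psi\|$, for normalized $\psi_j$ one has
$g \in U_\epsilon(f) \Rightarrow g \in U^{\epsilon}_{\{\psi_j\}_{j=1}^n}(f)$; therefore,
every strong-neighborhood contains a uniform neighborhood and is thus a uniform
neighborhood itself; in general, however, there can be uniform neighborhoods
which are not strong-neighborhoods, so that the uniform topology is finer than the
strong topology, $\tau_s \le \tau_u$, namely, $\tau_u$ has more neighborhoods. Practically speaking,
a sequence $f_n \in C(\mathcal{X})$ converges strongly to $f \in C(\mathcal{X})$, $s\text{-}\lim_n f_n = f$,
if $\lim_{n\to+\infty} \|(f - f_n)|\psi\rangle\| = 0$ for all $\psi \in L^2_\mu(\mathcal{X})$, and, while all uniformly convergent
sequences converge strongly, there can be strongly converging sequences
which do not converge uniformly.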

2. If $\{f_n\}_{n\in\mathbb{N}}$ converges with respect to $\tau_u$, it converges also with respect to $\tau_s$, but
not vice versa; it follows that the strong closure of $C(\mathcal{X})$, that is $C(\mathcal{X})$ together
with all its possible strong limit points, is strictly larger than $C(\mathcal{X})$. Indeed, it
contains $C(\mathcal{X})$, simple functions and discontinuous functions $f$ that may jump
arbitrarily but only on sets of zero measure [305]. Equip $\mathcal{X}$ with a $\sigma$-algebra and
a measure $\mu$; then,

$$\|f\|_\infty = \inf\bigl\{\, a > 0 : \mu(\{x : |f(x)| \ge a\}) = 0 \,\bigr\},$$

where $f$ is a measurable function on $\mathcal{X}$, defines a norm $\|\cdot\|_\infty$ called essential
norm. If $\|f\|_\infty < \infty$, then $|f(x)| > \|f\|_\infty$ only on a set of zero measure; further,
the following collection of measurable functions,

$$L^\infty_\mu(\mathcal{X}) = \{\, f : \|f\|_\infty < \infty \,\},$$

is a C* algebra with respect to the essential norm known as the algebra of essentially
bounded functions.

3. There is another topology on $C(\mathcal{X})$ which is inherited from its multiplicative action
on $L^2_\mu(\mathcal{X})$ and which is coarser than the strong topology, namely the weak topology,
$\tau_w \le \tau_s \le \tau_u$. It is generated by the semi-norms $L_{\phi,\psi}(f) := |\langle \phi | M_f | \psi \rangle|$
which define the weak neighborhoods

$$U^{\epsilon}_{\{\phi_j,\psi_j\}_{j=1}^n}(f) = \bigl\{\, g \in C(\mathcal{X}) : L_{\phi_j,\psi_j}(f - g) \le \epsilon,\ 1 \le j \le n \,\bigr\}. \tag{2.46}$$

A sequence $f_n \in C(\mathcal{X})$ converges weakly to $f \in C(\mathcal{X})$, $w\text{-}\lim_n f_n = f$, if and
only if $\lim_{n\to\infty} |\langle \phi |(f - f_n)| \psi \rangle| = 0$ for all $\psi, \phi \in L^2_\mu(\mathcal{X})$. As we shall see in the
more general non-commutative context, the strong and the weak closures coincide.
In the case of $C(\mathcal{X})$ they give rise to $L^\infty_\mu(\mathcal{X})$ which has the structure of a so-called
von Neumann algebra.

4. Actually, $L^\infty_\mu(\mathcal{X})$ can be generated as the strong closure on $L^2_\mu(\mathcal{X})$ of the algebra
containing the characteristic functions of finer and finer partitions of $\mathcal{X}$. More precisely,
one may consider a refining sequence $\{\mathcal{P}_n\}_{n\ge 0}$, $\mathcal{P}_n \le \mathcal{P}_{n+1}$, that generates
the $\sigma$-algebra of $\mathcal{X}$ when $n \to +\infty$. Each $\mathcal{P}_n$ generates a finite dimensional commutative
algebra $A_n$ whose elements are the step functions that are linear combinations of
the characteristic functions of the finitely many atoms of $\mathcal{P}_n$; then,

$$L^\infty_\mu(\mathcal{X}) = \overline{\bigcup_n A_n}^{\,\text{weak-closure}} .$$
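
The gap between uniform and strong convergence in Remarks 2.2.3 can be made concrete with a small numerical sketch. It assumes, purely for illustration, $\mathcal{X} = [0,1]$ with the Lebesgue measure and the sequence $f_n(x) = x^n$, neither of which is taken from the text. The sup norm of $f_n$ stays equal to 1, while the semi-norm $L_\psi(f_n)$ for the constant vector $\psi = 1$ equals $1/\sqrt{2n+1}$ and tends to zero. A single test vector only illustrates strong convergence and does not prove it.

```python
import math

def f_n(n):
    """The continuous function f_n(x) = x**n on [0, 1] (illustrative choice)."""
    return lambda x: x ** n

def uniform_norm(f, points):
    """Sup norm (2.42), approximated on a finite set of sample points."""
    return max(abs(f(x)) for x in points)

def strong_seminorm(f, psi, cells=20000):
    """L_psi(f) = || f psi || in L^2([0, 1], dx), computed with the midpoint rule."""
    h = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * h
        total += abs(f(x) * psi(x)) ** 2
    return math.sqrt(total * h)

grid = [k / 1000 for k in range(1001)]   # sample points, including x = 1
psi = lambda x: 1.0                      # one fixed vector in L^2([0, 1], dx)

for n in (1, 10, 100):
    # uniform norm stays at 1, the strong semi-norm decays like 1/sqrt(2n+1)
    print(n, uniform_norm(f_n(n), grid), strong_seminorm(f_n(n), psi))
```

The sequence therefore approaches the zero function in the semi-norm $L_\psi$ although $\|f_n - 0\| = 1$ for every $n$.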


In order to complete the formulation of measure-theoretic triplets $(\mathcal{X}, T, \mu)$ into
an algebraic framework, one has to endow $C(\mathcal{X})$ with a time-evolution corresponding
to $T$ and a map $C(\mathcal{X}) \to \mathbb{C}$ that plays the role of $\mu$ by assigning mean values to
continuous functions.

We shall consider invertible continuous dynamical maps $T$ on $\mathcal{X}$ and discrete-time
dynamics. As in Example 2.1.2, any $f \in C(\mathcal{X})$ changes in time according to

$$f \mapsto f(T^t x) = f \circ T^t(x) =: f_t(x), \qquad t \in \mathbb{Z}.$$

The map $\Theta_T : C(\mathcal{X}) \to C(\mathcal{X})$, defined by $\Theta_T(f) = f \circ T$, is an automorphism of
$C(\mathcal{X})$; namely, it is invertible and

$$\Theta_T(\alpha f + \beta g) = \alpha\,\Theta_T(f) + \beta\,\Theta_T(g), \qquad \Theta_T(fg) = \Theta_T(f)\,\Theta_T(g). \tag{2.47}$$

Moreover, $\Theta_T$ preserves the uniform norm.
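
The automorphism properties (2.47) can be checked mechanically on a toy model. The sketch below assumes a finite phase space $\{0,\dots,4\}$ in place of the compact space $\mathcal{X}$ and a hypothetical permutation `T`; functions are stored as lists of values, so that $\Theta_T(f) = f \circ T$ is a re-indexing.

```python
def theta(T, f):
    """Theta_T(f) = f o T, with f stored as the list of its values."""
    return [f[T[x]] for x in range(len(T))]

def sup_norm(f):
    """Uniform norm of a function on a finite set."""
    return max(abs(v) for v in f)

T = [2, 0, 3, 1, 4]                          # hypothetical invertible map on {0,...,4}
T_inv = [T.index(x) for x in range(len(T))]  # inverse permutation

f = [1 + 2j, -1 + 0j, 3j, 2 + 0j, -2 + 1j]
g = [0j, 1 - 1j, 2 + 2j, -3j, 1 + 0j]
a, b = 2 - 1j, 3j

# linearity: Theta(a f + b g) versus a Theta(f) + b Theta(g)
lin_lhs = theta(T, [a * u + b * v for u, v in zip(f, g)])
lin_rhs = [a * u + b * v for u, v in zip(theta(T, f), theta(T, g))]

# multiplicativity: Theta(f g) versus Theta(f) Theta(g)
mul_lhs = theta(T, [u * v for u, v in zip(f, g)])
mul_rhs = [u * v for u, v in zip(theta(T, f), theta(T, g))]

print(lin_lhs == lin_rhs, mul_lhs == mul_rhs)
print(sup_norm(theta(T, f)) == sup_norm(f))   # the uniform norm is preserved
print(theta(T_inv, theta(T, f)) == f)         # Theta_T is invertible
```

Invertibility of $\Theta_T$ comes from the invertibility of `T`: composing with the inverse permutation undoes the re-indexing.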


Example 2.2.3 In the Koopman-von Neumann formalism where functions $f \in C(\mathcal{X})$
are represented as multiplication operators, the Koopman operator $U_T$ implements
unitarily the automorphism $\Theta_T$; indeed, using (2.4), for all $\psi \in L^2_\mu(\mathcal{X})$ and
$x \in \mathcal{X}$,

$$\langle x | U_T\, f\, U_T^\dagger | \psi \rangle = f(Tx)\,\langle Tx | U_T^\dagger \psi \rangle = f(Tx)\,\langle T^{-1}\circ Tx | \psi \rangle = \langle x | \Theta_T(f) | \psi \rangle .$$

Notice that $U_T$ cannot belong to $C(\mathcal{X})$, otherwise it would commute with all
$f \in C(\mathcal{X})$ which would then be constant in time.


Concerning the possible states over $C(\mathcal{X})$ (see (2.2)), we shall consider the space
$M(\mathcal{X})$ of regular Borel measures over $\mathcal{X}$ (see Remark 2.1.1.5). The simplest instances
of elements of $M(\mathcal{X})$ are the evaluation functionals $\delta_x : C(\mathcal{X}) \to \mathbb{C}$, defined by
$\delta_x(f) := f(x)$, for all $x \in \mathcal{X}$ and $f \in C(\mathcal{X})$. These functionals can be seen as integration
with respect to Dirac delta distributions and embody the fact that phase-space
points are the simplest physical states: $\delta_x(f) = \int_{\mathcal{X}} dy\, f(y)\,\delta(y - x)$. By making
convex combinations of evaluation functionals one obtains more general positive,
normalized expectation functionals over $C(\mathcal{X})$.

Actually, a theorem of Riesz [305] asserts that the action of any such functional
is representable by integration with respect to a regular Borel measure in $M(\mathcal{X})$. In
view of the physical interpretation of states as positive functionals that assign mean
values to observables, it makes sense to identify measures $\mu \in M(\mathcal{X})$ and states
$\omega_\mu : C(\mathcal{X}) \to \mathbb{C}$ such that³

$$C(\mathcal{X}) \ni f \mapsto \omega_\mu(f) = \int_{\mathcal{X}} d\mu(x)\, f(x), \qquad \forall f \in C(\mathcal{X}). \tag{2.48}$$


Remarks 2.2.4 


1. With two measures $\mu_{1,2}$ on a measure space $\mathcal{X}$ all convex combinations $p\mu_1 +
(1 - p)\mu_2$ with $p \in [0, 1]$ provide other measures; therefore, the space of states
of classical systems is a convex set.

2. Given two measures $\mu_{1,2}$ on $\mathcal{X}$ equipped with a $\sigma$-algebra $\Sigma$, $\mu_1$ is said
to be absolutely continuous with respect to $\mu_2$, $\mu_1 \ll \mu_2$, if for any $B \in \Sigma$,
$\mu_2(B) = 0 \Rightarrow \mu_1(B) = 0$. Then, there exists a positive $f \in L^1_{\mu_2}(\mathcal{X})$ such that
$\mu_1(B) = \int_B d\mu_2(x)\, f(x)$ for all $B \in \Sigma$. The density $f(x)$ is called Radon-Nikodym
derivative and denoted by $\frac{d\mu_1}{d\mu_2}$. If also $\mu_2 \ll \mu_1$ then $\mu_1$ and $\mu_2$ are said
to be equivalent. In contrast, $\mu_1$ and $\mu_2$ are called mutually singular, $\mu_1 \perp \mu_2$, if
there exists $B \in \Sigma$ such that $\mu_1(B) = 0$ while $\mu_2(\mathcal{X} \setminus B) = 0$.

3. According to the Lebesgue decomposition theorem, given two measures $\mu$ and $m$
on $\mathcal{X}$, there exists a unique choice of measures $\mu_{1,2}$ and of $p \in [0, 1]$ such that
$\mu = p\mu_1 + (1 - p)\mu_2$ with $\mu_1 \ll m$ and $\mu_2 \perp m$.

4. If $\mathcal{X} = \mathbb{T}^2$ as in Example 2.1.3, then any $L^1_{dr}(\mathbb{T}^2) \ni \rho(r) \ge 0$ with $\int_{\mathbb{T}^2} dr\, \rho(r) = 1$
is the Radon-Nikodym derivative of a measure which is absolutely continuous
with respect to $dr$. Vice versa, evaluation functionals $\delta_r(f) = f(r)$ are singular
measures with respect to $dr$.


Finally, a measure $\mu \in M(\mathcal{X})$ is $T$-invariant if the corresponding mean values are
time-independent, $\omega_\mu(\Theta_T(f)) = \omega_\mu(f)$ or $\omega_\mu = \omega_\mu \circ \Theta_T$. Notice that (2.48) and
(2.47) allow one to extend the state $\omega_\mu$ and the automorphism $\Theta_T$ to the von Neumann
algebra of essentially bounded functions $L^\infty_\mu(\mathcal{X})$.


³ For the sake of notational convenience, we shall sometimes employ the notation $\mu(f)$ for $\omega_\mu(f)$.

Definition 2.2.4 To any measure-theoretic triplet $(\mathcal{X}, T, \mu)$, where $\mathcal{X}$ is a compact
metric space equipped with the Borel $\sigma$-algebra and $\mu \in M(\mathcal{X})$ is a $T$-invariant
regular Borel measure, one can associate a C* algebraic triplet $(C(\mathcal{X}), \Theta_T, \omega_\mu)$ and
a von Neumann triplet $(L^\infty_\mu(\mathcal{X}), \Theta_T, \omega_\mu)$ where the state $\omega_\mu$ and the automorphism $\Theta_T$ are
defined as in (2.48) and (2.47), respectively.


2.2.2 Conditional Probabilities and Expectations 


Given a measure space $(\mathcal{X}, \mu)$ with $\sigma$-algebra $\Sigma$, a finite partition $\mathcal{P} = \{P_i\}_i$ such
that $\mu(P_i) > 0$ for all $i$, and $X \in \Sigma$, consider the following function

$$\mathcal{X} \ni x \mapsto \mu(X|\mathcal{P})(x) := \frac{\mu(X \cap P_i)}{\mu(P_i)} \quad \text{if } x \in P_i, \tag{2.49}$$

$$\text{such that} \quad \int_{P_i} d\mu(x)\, \mu(X|\mathcal{P})(x) = \mu(X \cap P_i). \tag{2.50}$$

It is the conditional probability of $X \in \Sigma$ given the partition $\mathcal{P}$ and represents the
probability of the subset $X$ once it is known that $x$ belongs to one of the atoms of
$\mathcal{P}$. This notion can be extended to the case of partitions with atoms $P$ such that
$\mu(P) = 0$ by assigning one fixed, arbitrary real value to $\mu(X|\mathcal{P})(x)$ when $x \in P$:
in such a way one gets a family of versions of the conditional probability each of
which satisfies (2.50) [77]. One can extend (2.49) and (2.50) and define probability
distributions conditioned upon $\sigma$-subalgebras $\mathcal{T} \subset \Sigma$.

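Consider an integrable function $f \in L^1_\mu(\mathcal{X})$; the functional on $\mathcal{T}$ defined by
$F(T) := \int_T d\mu(x)\, f(x)$, $T \in \mathcal{T}$, is bounded, $\sigma$-additive and absolutely continuous
with respect to $\mu$ (see Remarks 2.1.1.3 and 2.2.4.1); its Radon-Nikodym derivative
$\frac{dF}{d\mu}(x) =: E(f|\mathcal{T})(x)$, such that

$$\int_T d\mu(x)\, E(f|\mathcal{T})(x) = \int_T d\mu(x)\, f(x) \qquad \forall\, T \in \mathcal{T}, \tag{2.51}$$

is $\mathcal{T}$-measurable and integrable and is called the conditional expectation of $f$ with
respect to the $\sigma$-algebra $\mathcal{T}$.

By choosing as $f$ the characteristic function $\mathbb{1}_X$ of a subset $X \in \Sigma$, its conditional
probability given $\mathcal{T}$ is thus defined by $\mu(X|\mathcal{T})(x) := E(\mathbb{1}_X|\mathcal{T})(x)$ and is such that

$$\int_T d\mu(x)\, \mu(X|\mathcal{T})(x) = \mu(X \cap T) \qquad \forall\, X \in \Sigma,\ T \in \mathcal{T}. \tag{2.52}$$
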


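Given a $\sigma$-subalgebra $\mathcal{T} \subset \Sigma$ consider the Abelian von Neumann algebra
$L^\infty_\mu(\mathcal{X}, \mathcal{T})$ consisting of the essentially bounded $\mathcal{T}$-measurable functions on $\mathcal{X}$
(see Remark 2.2.3.2). This is a subalgebra of the Abelian von Neumann algebra
$L^\infty_\mu(\mathcal{X})$ of the $\Sigma$-measurable essentially bounded functions on $\mathcal{X}$. Then, (2.51) makes
the conditional expectation a map $E(\cdot|\mathcal{T}) : L^\infty_\mu(\mathcal{X}) \to L^\infty_\mu(\mathcal{X}, \mathcal{T})$ which is
linear, positive and a measure preserving projection, that is $\mu \circ E(\cdot|\mathcal{T}) = \mu$ and
$E(E(f|\mathcal{T})|\mathcal{T}) = E(f|\mathcal{T})$. The first three assertions are evident, while idempotency
is a corollary of the following more general property. Suppose $\mathcal{T}_1 \le \mathcal{T}_2$ are two
$\sigma$-subalgebras of $\Sigma$ such that $T_1 \in \mathcal{T}_1 \Rightarrow T_1 \in \mathcal{T}_2$ but not vice versa, in general. Then,
if $f \in L^1_\mu(\mathcal{X})$, (2.51) yields

$$\int_{T_1} d\mu(x)\, E(E(f|\mathcal{T}_2)|\mathcal{T}_1)(x) = \int_{T_1} d\mu(x)\, E(f|\mathcal{T}_2)(x) = \int_{T_1} d\mu(x)\, f(x) = \int_{T_1} d\mu(x)\, E(f|\mathcal{T}_1)(x),$$

for all $T_1 \in \mathcal{T}_1$, whence $\mathcal{T}_1 \le \mathcal{T}_2 \Rightarrow E(E(f|\mathcal{T}_2)|\mathcal{T}_1) = E(f|\mathcal{T}_1)$.

Proposition 2.2.1 If $f \in L^1_\mu(\mathcal{X})$ and $g \in L^\infty_\mu(\mathcal{X}, \mathcal{T})$, where $\mathcal{T} \subset \Sigma$, then

$$E(g f|\mathcal{T}) = g\, E(f|\mathcal{T}). \tag{2.53}$$


Proof Suppose $g = \mathbb{1}_T$, the characteristic function of a subset $T \in \mathcal{T}$; then, for all
$T_0 \in \mathcal{T}$, $T \cap T_0 \in \mathcal{T}$ and (2.51) yields

$$\int_{T_0} d\mu(x)\, E(\mathbb{1}_T f|\mathcal{T})(x) = \int_{T \cap T_0} d\mu(x)\, f(x) = \int_{T_0} d\mu(x)\, \mathbb{1}_T(x)\, E(f|\mathcal{T})(x).$$

Then, one concludes the proof by approximating $g \in L^\infty_\mu(\mathcal{X}, \mathcal{T})$ with respect to the
essential norm by means of simple functions.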
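
On a finite probability space, where every $\sigma$-subalgebra is generated by a partition, the conditional expectation reduces to averaging over atoms as in (2.49), and the properties above can be verified exactly. The sketch below assumes a six-point space with hypothetical weights `mu`, a function `f`, and two nested partitions `coarse` and `fine`; none of these come from the text.

```python
from fractions import Fraction

def cond_exp(f, mu, partition):
    """E(f|T) for the sigma-algebra T generated by a finite partition
    whose atoms all have positive measure: average f over each atom."""
    out = [None] * len(f)
    for atom in partition:
        weight = sum(mu[x] for x in atom)
        average = sum(mu[x] * f[x] for x in atom) / weight
        for x in atom:
            out[x] = average
    return out

def mean(f, mu):
    """mu(f) = sum_x mu(x) f(x)."""
    return sum(m * v for m, v in zip(mu, f))

mu = [Fraction(k, 12) for k in (1, 2, 3, 1, 2, 3)]   # probabilities, summing to 1
f = [Fraction(v) for v in (3, -1, 4, 1, -5, 9)]
coarse = [[0, 1, 2], [3, 4, 5]]                      # generates T_1
fine = [[0], [1, 2], [3], [4, 5]]                    # generates T_2, which contains T_1
g = [Fraction(2)] * 3 + [Fraction(-7)] * 3           # constant on atoms of `coarse`

E1 = cond_exp(f, mu, coarse)
E2 = cond_exp(f, mu, fine)

print(cond_exp(E2, mu, coarse) == E1)                # E(E(f|T_2)|T_1) = E(f|T_1)
print(mean(E1, mu) == mean(f, mu))                   # measure preserving
gf = [u * v for u, v in zip(g, f)]
print(cond_exp(gf, mu, coarse) == [u * v for u, v in zip(g, E1)])   # (2.53)
```

Exact rational arithmetic is used so that the identities hold as equalities rather than up to rounding.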


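Given a refining sequence $\{\mathcal{T}_n\}_{n\in\mathbb{Z}}$, that is $n \le m \Rightarrow \mathcal{T}_n \subseteq \mathcal{T}_m \subseteq \Sigma$, we shall set
$\mathcal{T}_{+\infty} := \bigvee_{n\in\mathbb{Z}} \mathcal{T}_n$, the smallest $\sigma$-subalgebra containing all the $\mathcal{T}_n$'s ($\mathcal{T}_n \uparrow \mathcal{T}_{+\infty}$), and
denote by $\mathcal{T}_{-\infty} := \bigwedge_{n\in\mathbb{Z}} \mathcal{T}_n$ the largest $\sigma$-subalgebra contained in all the $\mathcal{T}_n$
($\mathcal{T}_n \downarrow \mathcal{T}_{-\infty}$). The proof of the following continuity properties can be found in [77, 123].


Theorem 2.2.1 Let $\mathcal{X}$ be a measure space equipped with a $\sigma$-algebra $\Sigma$ and a
measure $\mu$; given a refining sequence of $\sigma$-subalgebras $\mathcal{T}_n$, then

$$\lim_{n\to+\infty} E(f|\mathcal{T}_n) = E(f|\mathcal{T}_{+\infty}), \qquad \lim_{n\to-\infty} E(f|\mathcal{T}_n) = E(f|\mathcal{T}_{-\infty})$$

for all $f \in L^1_\mu(\mathcal{X})$.

Examples 2.2.4


1. Let $\Sigma \supseteq \mathcal{T} := \mathcal{N}$, the trivial $\sigma$-algebra consisting of the empty set $\emptyset$ and the whole
of $\mathcal{X}$; then, $E(f|\mathcal{N}) = \mu(f)\,\mathbb{1}$. On the other hand, if $\mathcal{T} = \Sigma$, then $E(f|\Sigma) = f$.

2. By inserting characteristic functions $f = \mathbb{1}_X$, $X \in \Sigma$, in the above theorem, one
gets the following continuity properties of the conditional probabilities:

$$\lim_{n\to\pm\infty} \mu(X|\mathcal{T}_n) = \mu(X|\mathcal{T}_{\pm\infty}), \qquad \mu\text{-a.e.},$$

for all $\Sigma$-measurable subsets $X$ of $\mathcal{X}$.

3. Consider the unit interval $[0, 1)$ with the Borel $\sigma$-algebra $\Sigma$ and the Lebesgue
measure $d\mu(x) = dx$; construct the measure algebra $\mathcal{T}_n$ generated by the partition
$\mathcal{P}_n$ of $[0, 1)$ into $2^n$ atoms $P_k = [k2^{-n}, (k + 1)2^{-n})$. Then, $\mathcal{T}_n \uparrow \Sigma$ and [77]

$$E(f|\mathcal{T}_n)(x) = \sum_{k=0}^{2^n-1} \mathbb{1}_{P_k}(x)\; 2^n \int_{k2^{-n}}^{(k+1)2^{-n}} dt\, f(t),$$

for all $f \in L^1_{[0,1)}(dt)$. For $n \to \infty$ the summand containing $x$ tends to the derivative
of the integral at $x$ and thus to $f(x)$ $\mu$-a.e.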
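
The dyadic averages of the last example can be evaluated exactly for a concrete function. The sketch below assumes $f(t) = t^2$, with antiderivative `F`, and the point $x = 1/3$, both chosen only for illustration. It computes $2^n \int_{P_k} f$ on the atom containing $x$ and compares it with $f(x)$.

```python
from fractions import Fraction

def dyadic_cond_exp(F, n, x):
    """E(f|T_n)(x) on [0, 1): 2**n times the integral of f over the dyadic
    atom [k 2**-n, (k+1) 2**-n) containing x; F is an antiderivative of f."""
    k = int(x * 2 ** n)                        # index of the atom containing x
    a, b = Fraction(k, 2 ** n), Fraction(k + 1, 2 ** n)
    return 2 ** n * (F(b) - F(a))

F = lambda t: t ** 3 / 3                       # antiderivative of f(t) = t**2
x = Fraction(1, 3)

errors = [abs(dyadic_cond_exp(F, n, x) - x ** 2) for n in range(0, 11)]
for n, err in enumerate(errors):
    print(n, float(err))                       # tends to zero as n grows
```

For $n = 0$ the partition is trivial and the value is the mean $\mu(f) = 1/3$, in agreement with the first example; as $n$ grows the values approach $f(x) = 1/9$.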


2.2.3 Dynamical Shifts and Classical Spin Chains 


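Dynamical shifts and symbolic models can be given an algebraic formulation in
terms of classical spin chains. Consider a triplet $(\Omega_p, T_\sigma, \mu)$, that is a shift-dynamical
system over doubly infinite sequences of symbols from an alphabet with $p$ elements
that leaves invariant a measure $\mu$.


Let us associate to each symbol $j \in \{1, 2, \ldots, p\}$ a $p \times p$ matrix of the form

$$P_j = \begin{pmatrix} 0 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 0 \end{pmatrix}, \qquad \text{with the only non-zero entry, } 1, \text{ as } (j,j)\text{-th entry}.$$

Varying $1 \le j \le p$, we obtain an orthonormal family of orthogonal projections such
that $P_i P_j = \delta_{ij} P_j$ and $\sum_{j=1}^p P_j = \mathbb{1}_p$, where $\mathbb{1}_p$ denotes the $p \times p$ identity matrix;
these projectors generate the diagonal $p \times p$ matrix algebra $D_p(\mathbb{C})$⁴ with elements

$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_p \end{pmatrix} = \sum_{j=1}^{p} d_j P_j . \tag{2.54}$$


⁴ These commutative matrix algebras are also called Abelian and projections such as the $P_j$ are known
as minimal projectors.

To each label $j$ in a sequence $\boldsymbol{i} = \{i_j\}_{j\in\mathbb{Z}}$ one thus associates the diagonal matrix
algebra $D_p(\mathbb{C})$: each of its minimal projectors thus corresponds to a simple cylinder.
Extending the construction to generic cylinders $C^{[0,n-1]}_{i_0 i_1 \cdots i_{n-1}}$ as in (2.24), these
correspond to tensor products of projectors

$$P^{[0,n-1]}_{\boldsymbol{i}^{(n)}} := \bigotimes_{j=0}^{n-1} P_{i_j} = P_{i_0} \otimes P_{i_1} \otimes \cdots \otimes P_{i_{n-1}} . \tag{2.55}$$

Then, the natural matricial description that one associates to strings of length $n$
is the diagonal matrix algebra $D^{(n)} = D_p^{\otimes n} := \bigotimes_{j=0}^{n-1} (D_p(\mathbb{C}))_j$, namely the tensor
product of $n$ copies of $D_p(\mathbb{C})$ whose elements are diagonal $p^n \times p^n$ matrices of the
form

$$D^{(n)} = \sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} d(\boldsymbol{i}^{(n)})\, P^{[0,n-1]}_{\boldsymbol{i}^{(n)}} . \tag{2.56}$$
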


A suggestive physical picture is as follows: each matrix algebra $D_p(\mathbb{C})$ describes
a classical spin, with $p$ possible states, in an infinite classical ferromagnet. Spins
located at the lattice sites $-n \le j \le n$ are described by tensor products of the form
$D_{[-n,n]} := \bigotimes_{j=-n}^{n} (D_p(\mathbb{C}))_j$. These matrix algebras can be interpreted as algebras
of observables for finite portions of the infinite ferromagnet by the embedding
$D_{[-n,n]} \mapsto \mathbb{1}_{-n-1]} \otimes D_{[-n,n]} \otimes \mathbb{1}_{[n+1}$ into $D_\infty := \bigcup_{n} D_{[-n,n]}$, where $\mathbb{1}_{-n-1]}$
and $\mathbb{1}_{[n+1}$ denote the tensor products of infinitely many identity matrices $\mathbb{1} \in D_p(\mathbb{C})$
located along the two-sided chain at sites from $-\infty$ up to $-n - 1$, respectively from
$n + 1$ up to $+\infty$.

Each $D_{[-n,n]}$ can be equipped with the standard sup-norm of matrix algebras (see
(5.3)).⁵ The sup-norm inductively extends to $D_\infty$ and allows one to consider the uniform
closure $D_{\mathbb{Z}} := \overline{\bigcup_{n\in\mathbb{N}} D_{[-n,n]}}^{\,\text{unif}}$. This procedure is known as C*-inductive limit
[80] as it involves an increasing sequence of local algebras $D_{[-n,n]}$; $D_{\mathbb{Z}}$ provides a
C* algebraic description of a classical spin chain.

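Using (2.26), the left-shift along sequences gives rise to an algebraic shift map
$\Theta_\sigma : D_{\mathbb{Z}} \to D_{\mathbb{Z}}$ such that

$$\Theta_\sigma(D_{[-n,n]}) = D_{[-n+1,n+1]} . \tag{2.57}$$

Further, the local probability measures $\mu^{(n)} := \{p(\boldsymbol{i}^{(n)})\}_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}}$ that yield the
global $T_\sigma$-invariant state $\mu$ over $\Omega_p$ can be associated with diagonal matrices

$$\rho^{(n)} = \sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} p(\boldsymbol{i}^{(n)})\, P^{[0,n-1]}_{\boldsymbol{i}^{(n)}} , \tag{2.58}$$

by means of the trace operation (see (5.19)) which, acting on any matrix, returns the
sum of its diagonal entries. In fact, multiplying $\rho^{(n)}$ with matrices as in (2.56) gives
another matrix in $D^{(n)}$ with diagonal elements $p(\boldsymbol{i}^{(n)})\, d(\boldsymbol{i}^{(n)})$, and one gets

$$\mathrm{Tr}\bigl(\rho^{(n)} D^{(n)}\bigr) = \sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} p(\boldsymbol{i}^{(n)})\, d(\boldsymbol{i}^{(n)}),$$

whence $p(\boldsymbol{i}^{(n)}) = \mathrm{Tr}\bigl(\rho^{(n)} P^{[0,n-1]}_{\boldsymbol{i}^{(n)}}\bigr)$. Therefore, conditions (2.29)-(2.30) translate
into the following algebraic relations to be satisfied by $\{\rho^{(n)}\}_{n\in\mathbb{N}}$:

$$\mathrm{Tr}_{n-1}\, \rho^{(n)} = \rho^{(n-1)} = \mathrm{Tr}_{0}\, \rho^{(n)} , \tag{2.59}$$

where $\mathrm{Tr}_j$ denotes the trace with respect to the $j$-th factor, the factors being labelled
by the sites $0, 1, \ldots, n-1$. These conditions allow one to
consistently define a global state $\omega_\mu$ on the spin chain $D_{\mathbb{Z}}$; this state is specified by its
values as a positive expectation functional over local spin arrays where it coincides
with the local states $\rho^{(n)}$ (which we shall encounter in the quantum setting as density
matrices).


⁵ The sup-norm of a diagonal matrix $D$ is the square root of the largest diagonal element of $D^\dagger D$.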
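
The compatibility of the local states under partial traces can be illustrated with a short sketch. It assumes a two-letter alphabet and a stationary Markov measure with hypothetical transition probabilities `P[i][j]` $= p(j|i)$ and stationary vector `pi`; the diagonal matrix $\rho^{(n)}$ is stored as a dictionary from strings $(i_0,\dots,i_{n-1})$ to probabilities, so that tracing out a tensor factor amounts to summing over the corresponding symbol.

```python
from fractions import Fraction
from itertools import product

P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]       # P[i][j] = p(j|i), hypothetical values
pi = [Fraction(1, 3), Fraction(2, 3)]        # stationary: sum_i pi[i] P[i][j] = pi[j]

def rho(n):
    """Diagonal of rho^(n): string (i_0,...,i_{n-1}) -> p(i_0 ... i_{n-1}), n >= 1."""
    diag = {}
    for s in product(range(len(pi)), repeat=n):
        prob = pi[s[0]]
        for a, b in zip(s, s[1:]):
            prob *= P[a][b]
        diag[s] = prob
    return diag

def trace_last(diag):
    """Partial trace over the last tensor factor."""
    out = {}
    for s, v in diag.items():
        out[s[:-1]] = out.get(s[:-1], 0) + v
    return out

def trace_first(diag):
    """Partial trace over the first tensor factor."""
    out = {}
    for s, v in diag.items():
        out[s[1:]] = out.get(s[1:], 0) + v
    return out

def expectation(diag_rho, diag_D):
    """Tr(rho D) for diagonal matrices given by their diagonals."""
    return sum(diag_rho[s] * diag_D[s] for s in diag_rho)

for n in range(1, 5):
    print(n, trace_last(rho(n + 1)) == rho(n), trace_first(rho(n + 1)) == rho(n))

# Tr(rho^(2) P_{(0,1)}) returns the probability of the string (0, 1)
proj_01 = {s: int(s == (0, 1)) for s in rho(2)}
print(expectation(rho(2), proj_01))
```

Tracing out the last factor only uses the normalization of the transition probabilities, whereas tracing out the first factor requires the stationarity of `pi`; this mirrors the roles of compatibility and shift-invariance of the local measures.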


Definition 2.2.5 (Classical Spin Chains) The C* algebraic triplet $(D_{\mathbb{Z}}, \Theta_\sigma, \omega_\mu)$
associated with a measure-theoretic triplet $(\Omega_p, T_\sigma, \mu)$ will be referred to as a classical
spin chain.


Remarks 2.2.5 In Sect. 5.3.2, it will be proved that to all classical spin chains as
defined above, there correspond algebraic triplets as in Definition 2.2.4. In particular,
the von Neumann algebraic triplets arise when the C* triplets $(D_{\mathbb{Z}}, \Theta_\sigma, \omega_\mu)$ are
represented on a Hilbert space and enlarged by adding to them their weak-limit
points.


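Example 2.2.5 ([135]) Consider a Markov chain as in Example 2.1.5. Let

$$\rho = \sum_{i=1}^{d} p(i)\, P_i = \begin{pmatrix} p(1) & 0 & \cdots & 0 \\ 0 & p(2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & p(d) \end{pmatrix}$$

correspond to the probability measure $\mu = \{p(i)\}_{i=1}^{d}$. Define on the tensor product
$D_d(\mathbb{C}) \otimes D_d(\mathbb{C})$ a linear map $E : D_d(\mathbb{C}) \otimes D_d(\mathbb{C}) \to D_d(\mathbb{C})$ by linear extension
of the following action on tensor products of minimal projectors $P_i \in D_d(\mathbb{C})$,

$$P_i \otimes P_j \mapsto E[P_i \otimes P_j] := p(j|i)\, P_i ,$$

where $p(j|i)$ are the transition probabilities of the Markov chain. From (2.34) and
(2.35) and using that $\sum_{k=1}^{d} P_k = \mathbb{1}$,

$$E[\mathbb{1} \otimes \mathbb{1}] = \sum_{i,j=1}^{d} E[P_i \otimes P_j] = \sum_{i,j=1}^{d} p(j|i)\, P_i = \sum_{i=1}^{d} P_i = \mathbb{1},$$

$$\mathrm{Tr}\bigl(\rho\, E[\mathbb{1} \otimes P_k]\bigr) = \sum_{i=1}^{d} \mathrm{Tr}\bigl(\rho\, E[P_i \otimes P_k]\bigr) = \sum_{i=1}^{d} p(k|i)\, \mathrm{Tr}(\rho P_i) = \sum_{i=1}^{d} p(k|i)\, p(i) = p(k) = \mathrm{Tr}(\rho P_k),$$

for all $k = 1, 2, \ldots, d$. Furthermore, higher order probabilities as in (2.36) are
iteratively obtained as:

$$p(i_0 i_1 \cdots i_{n-1}) = \mathrm{Tr}\bigl(\rho\, E[P_{i_0} \otimes E[P_{i_1} \otimes \cdots E[P_{i_{n-1}} \otimes \mathbb{1}]]]\bigr).$$

For instance, to evaluate $\mathrm{Tr}\bigl(\rho\, E[P_{i_0} \otimes E[P_{i_1} \otimes \mathbb{1}]]\bigr)$ use $\sum_{k=1}^{d} P_k = \mathbb{1}$; then

$$\sum_{k=1}^{d} \mathrm{Tr}\bigl(\rho\, E[P_{i_0} \otimes E[P_{i_1} \otimes P_k]]\bigr) = \sum_{k=1}^{d} p(k|i_1)\, \mathrm{Tr}\bigl(\rho\, E[P_{i_0} \otimes P_{i_1}]\bigr) = \mathrm{Tr}\bigl(\rho\, E[P_{i_0} \otimes P_{i_1}]\bigr)$$
$$= p(i_1|i_0)\, \mathrm{Tr}(\rho P_{i_0}) = p(i_1|i_0)\, p(i_0) = p(i_0 i_1).$$

Therefore, the local density matrices $\rho^{(n)}$ are the local restrictions $\omega_\mu | D^{(n)}$ of a global
state $\omega_\mu$ on the classical spin chain $D_{\mathbb{Z}}$ such that, for all $D_j \in D_d(\mathbb{C})$,

$$\omega_\mu(D_1 \otimes D_2 \otimes \cdots \otimes D_n) = \mathrm{Tr}\bigl(\rho\, E[D_1 \otimes E[D_2 \otimes \cdots E[D_n \otimes \mathbb{1}]]]\bigr).$$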
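
The nested structure of the map $E$ lends itself to a direct implementation. The sketch below assumes $d = 2$, hypothetical transition probabilities `trans[i][j]` $= p(j|i)$ and the corresponding stationary distribution `rho`; diagonal matrices are stored as lists of diagonal entries, so that on product elements $E[A \otimes B]$ has $i$-th entry $a_i \sum_j p(j|i)\, b_j$.

```python
from fractions import Fraction

trans = [[Fraction(1, 2), Fraction(1, 2)],
         [Fraction(1, 4), Fraction(3, 4)]]   # trans[i][j] = p(j|i), hypothetical values
rho = [Fraction(1, 3), Fraction(2, 3)]       # stationary probabilities p(i)
d = len(rho)

def E(a, b):
    """E[A (x) B] for diagonal A, B: linear extension of P_i (x) P_j -> p(j|i) P_i."""
    return [a[i] * sum(trans[i][j] * b[j] for j in range(d)) for i in range(d)]

def projector(k):
    """Diagonal of the minimal projector P_k."""
    return [Fraction(int(i == k)) for i in range(d)]

identity = [Fraction(1)] * d

def state(factors):
    """omega(D_1 (x) ... (x) D_n) = Tr(rho E[D_1 (x) E[D_2 (x) ... E[D_n (x) 1]]])."""
    inner = identity
    for D in reversed(factors):
        inner = E(D, inner)
    return sum(r * v for r, v in zip(rho, inner))

print(E(identity, identity) == identity)                         # E[1 (x) 1] = 1
print([state([identity, projector(k)]) for k in range(d)])       # reproduces p(k)
print(state([projector(0), projector(1), projector(1)]))         # p(0) p(1|0) p(1|1)
```

Evaluating `state` on products of minimal projectors returns the Markov probabilities $p(i_0)\,p(i_1|i_0)\cdots$, as in the worked case of the example.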


2.3  Ergodicity and Mixing 


The two uncoupled harmonic oscillators of Example 2.1.2 whose orbits fill the phase-space
densely (see Remarks 2.1.2.2 and 2.1.2.3) are typical ergodic systems. Ergodic
theory developed [200] from the attempt to explain why in thermodynamic systems
time-averages of typical observables coincide with their mean values $\mu(f)$
with respect to equilibrium distributions. Intuitively, if an orbit fills the energy shell
densely, evaluating the time-average of a function along such an orbit should indeed
amount to integrating with respect to the Liouville measure restricted to the energy
shell.

Definition 2.3.1 Let $f : \mathcal{X} \to \mathbb{C}$ be a complex function associated to a dynamical
system $(\mathcal{X}, T, \mu)$; time-averages are defined by

$$\overline{f}(x) = \lim_{t\to+\infty} \frac{1}{t} \sum_{s=0}^{t-1} f(T^s x), \qquad \overline{f}(x) = \lim_{t\to+\infty} \frac{1}{t} \int_0^t ds\, f(T_s x)$$

in discrete, respectively continuous time.


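Example 2.3.1 Consider the uncoupled oscillators of Example 2.1.2 and a continuous
function $f : \mathbb{T}^2 \to \mathbb{R}$. By means of (2.12), the discrete and continuous time
averages yield

$$\overline{f}(\boldsymbol{\theta}) = \lim_{t\to+\infty} \sum_{\boldsymbol{n}\in\mathbb{Z}^2} \hat{f}(\boldsymbol{n})\, \frac{e^{it\sum_i \omega_i n_i} - 1}{t\,\bigl(e^{i\sum_i \omega_i n_i} - 1\bigr)}\, e_{\boldsymbol{n}}(\boldsymbol{\theta}) = \sum_{\substack{\boldsymbol{n}\in\mathbb{Z}^2 \\ \sum_i \omega_i n_i \in 2\pi\mathbb{Z}}} \hat{f}(\boldsymbol{n})\, e_{\boldsymbol{n}}(\boldsymbol{\theta}),$$

respectively

$$\overline{f}(\boldsymbol{\theta}) = \lim_{t\to+\infty} \sum_{\boldsymbol{n}\in\mathbb{Z}^2} \hat{f}(\boldsymbol{n})\, \frac{e^{it\sum_i \omega_i n_i} - 1}{i t \sum_i \omega_i n_i}\, e_{\boldsymbol{n}}(\boldsymbol{\theta}) = \sum_{\substack{\boldsymbol{n}\in\mathbb{Z}^2 \\ \sum_i \omega_i n_i = 0}} \hat{f}(\boldsymbol{n})\, e_{\boldsymbol{n}}(\boldsymbol{\theta}),$$

where the ratios are understood to equal 1 for the resonant terms, which are the only
ones surviving the limit. Then, besides ensuring that orbits fill $\mathbb{T}^2$ densely, the conditions
in Remarks 2.1.2.2 and 2.1.2.3 also imply that time-averages coincide with their mean values:

$$\overline{f}(\boldsymbol{\theta}) = \hat{f}(\boldsymbol{0}) = \int_{\mathbb{T}^2} d\boldsymbol{\theta}\, f(\boldsymbol{\theta}) = \mu(f).$$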
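
The role of the resonance condition can be seen numerically. The sketch below is only an illustration under stated assumptions: angles are measured in units of a full turn, so that the discrete-time resonance condition reads $\sum_i \omega_i n_i \in \mathbb{Z}$ instead of $2\pi\mathbb{Z}$, and the frequencies, initial points and test functions are hypothetical choices. With rationally independent frequencies the time average of `f` approaches its mean value 1; with rational frequencies a resonant Fourier mode survives and the time average of `h` stays at 1 although its mean value is 0.

```python
import math

def time_average(f, theta0, omega, t):
    """(1/t) sum_{s<t} f(theta0 + s*omega mod 1) for a rotation on the 2-torus."""
    total = 0.0
    for s in range(t):
        th1 = (theta0[0] + s * omega[0]) % 1.0
        th2 = (theta0[1] + s * omega[1]) % 1.0
        total += f(th1, th2)
    return total / t

def f(th1, th2):
    """Test function with mean value 1 over the torus."""
    return 1.0 + math.cos(2 * math.pi * th1) + math.sin(2 * math.pi * (th1 + 2 * th2))

def h(th1, th2):
    """Single Fourier mode n = (2, 0), with mean value 0 over the torus."""
    return math.cos(2 * math.pi * 2 * th1)

# non-resonant case: 1, sqrt(2), sqrt(3) are rationally independent
avg = time_average(f, (0.1, 0.7), (math.sqrt(2), math.sqrt(3)), 20000)

# resonant case: omega_1 = 1/2, so that 2 * omega_1 is an integer
resonant = time_average(h, (0.0, 0.0), (0.5, 0.25), 20000)

print(avg, resonant)
```

The deviation of `avg` from 1 decays like $1/t$, since each non-resonant mode contributes a bounded geometric sum divided by $t$.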


A considerable breakthrough in ergodic theory was Birkhoff's theorem.⁶


Theorem 2.3.1 Given $(\mathcal{X}, T, \mu)$, let $f \in L^1_\mu(\mathcal{X})$ be a complex $\mu$-summable function
on $\mathcal{X}$. Then,

1. the time-average $\overline{f}(x)$ exists $\mu$-a.e. on $\mathcal{X}$;
2. the time-average $\overline{f}$ is $T$-invariant: $\overline{f} \circ T = \overline{f}$ $\mu$-a.e.;
3. the time-average $\overline{f} \in L^1_\mu(\mathcal{X})$ and $\mu(\overline{f}) = \mu(f)$.


The proof [77,111] of these important results hinges on the following lemma
known as maximal ergodic theorem.


⁶ Though the results presented below can be extended to dynamical systems in continuous time, we
shall concentrate on discrete time dynamical systems.

Lemma 2.3.1 Given the dynamical triplet $(\mathcal{X}, T, \mu)$, for any real-valued $f \in L^1_\mu(\mathcal{X})$, set
$S^f_k(x) := \sum_{j=0}^{k-1} f(T^j x)$ and $A^f := \{x \in \mathcal{X} : \sup_{k>0} S^f_k(x) > 0\}$; then,

$$\int_{A^f} d\mu(x)\, f(x) \ge 0.$$

Proof Set $\Phi^f_n(x) := \max\{0, S^f_1(x), \ldots, S^f_n(x)\}$ and split $\mathcal{X}$ into the subset
$A^f_n := \{x : \Phi^f_n(x) > 0\}$ and its complement where $\Phi^f_n(x) = 0$. Further,
$\Psi^f_n(x) := \max\{S^f_1(x), \ldots, S^f_n(x)\} = \Phi^f_n(x)$ on $A^f_n$; also,

$$\Psi^f_{n+1}(x) = \max\bigl\{ f(x),\ f(x) + f(Tx),\ \ldots,\ f(x) + f(Tx) + \cdots + f(T^n x) \bigr\}$$
$$= f(x) + \max\bigl\{ 0,\ f(Tx),\ \ldots,\ f(Tx) + \cdots + f(T^n x) \bigr\}$$
$$= f(x) + \Phi^f_n(Tx).$$

Thus, since $\Phi^f_n(x)$ is non-negative, $\mu$ is $T$-invariant and $\Psi^f_{n+1}(x) \ge \Psi^f_n(x)$,

$$\int_{A^f_n} d\mu(x)\, f(x) = \int_{A^f_n} d\mu(x)\, \bigl(\Psi^f_{n+1}(x) - \Phi^f_n(Tx)\bigr)$$
$$\ge \int_{A^f_n} d\mu(x)\, \Psi^f_n(x) - \int_{\mathcal{X}} d\mu(x)\, \Phi^f_n(Tx)$$
$$= \int_{A^f_n} d\mu(x)\, \Phi^f_n(x) - \int_{\mathcal{X}} d\mu(x)\, \Phi^f_n(x) = 0.$$

Then, the result follows for, when $n \to +\infty$, the points in $A^f$ that are not in $A^f_n$
form a set of vanishingly small measure $\mu$.
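
The inequality obtained in the proof holds for every finite $n$, which makes it easy to test on a toy system. The sketch below assumes a finite phase space with the counting measure, which is invariant under any permutation; the permutation `T` and the integer-valued function `f` are drawn at random with a fixed seed and are purely illustrative.

```python
import random

def maximal_set(f, T, n):
    """A_n^f = {x : max_{1<=k<=n} S_k^f(x) > 0}, with S_k^f(x) = sum_{j<k} f(T^j x)."""
    A = []
    for x in range(len(T)):
        partial, y, best = 0, x, None
        for _ in range(n):
            partial += f[y]              # partial sum S_k^f(x)
            y = T[y]
            best = partial if best is None else max(best, partial)
        if best > 0:
            A.append(x)
    return A

rng = random.Random(0)
N = 50
T = list(range(N))
rng.shuffle(T)                           # a permutation preserves the counting measure
f = [rng.randint(-5, 5) for _ in range(N)]

# "integral" of f over A_n^f with respect to the counting measure
integrals = [sum(f[x] for x in maximal_set(f, T, n)) for n in range(1, 2 * N + 1)]
print(min(integrals) >= 0)
```

No particular property of `f` is needed: the sum of `f` over the points whose partial sums become positive within `n` steps is never negative.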


Proof of Theorem 2.3.1 Let $a < b \in \mathbb{R}$ and, using the notations of the previous
lemma, set

$$E_{ab} := \Bigl\{ x \in \mathcal{X} : \liminf_{n\to+\infty} \frac{1}{n} S^f_n(x) < a < b < \limsup_{n\to+\infty} \frac{1}{n} S^f_n(x) \Bigr\}.$$

Let $g^{(1)}_{ab}(x) := f(x) - b$ when $x \in E_{ab}$, otherwise $g^{(1)}_{ab}(x) = 0$; consider the set

$$\Bigl\{ x \in \mathcal{X} : \sup_n \frac{1}{n} S^{g^{(1)}_{ab}}_n(x) > 0 \Bigr\} = \Bigl\{ x \in E_{ab} : \sup_n \frac{1}{n} S^f_n(x) > b \Bigr\}.$$

This set not only coincides with the set $A^{g^{(1)}_{ab}}$ as defined in Lemma 2.3.1, but it also
equals $E_{ab}$; while the first property is contained in the definition of the set, the second
one follows from the fact that, on one hand,

$$\limsup_{n\to+\infty} \frac{1}{n} S^f_n(x) > b \;\Rightarrow\; \sup_{n} \frac{1}{n} S^f_n(x) > b \;\Rightarrow\; E_{ab} \subseteq A^{g^{(1)}_{ab}} .$$

On the other hand, by definition of $E_{ab}$, if $x \notin E_{ab}$ also $T^n x \notin E_{ab}$ for all $n$, whence
$g^{(1)}_{ab}(T^n x) = 0$, $S^{g^{(1)}_{ab}}_n(x) = 0$ and $x \notin A^{g^{(1)}_{ab}}$. Thus, Lemma 2.3.1 yields

$$\int_{A^{g^{(1)}_{ab}}} d\mu(x)\, g^{(1)}_{ab}(x) = \int_{E_{ab}} d\mu(x)\, \bigl(f(x) - b\bigr) \ge 0.$$

The same argument applied to the function $g^{(2)}_{ab}(x) := a - f(x)$ if $x \in E_{ab}$, otherwise
$g^{(2)}_{ab}(x) = 0$, gives $\int_{E_{ab}} d\mu(x)\, \bigl(a - f(x)\bigr) \ge 0$, whence

$$b\, \mu(E_{ab}) \le \int_{E_{ab}} d\mu(x)\, f(x) \le a\, \mu(E_{ab}).$$

Since $a < b$ this can only be possible if $\mu(E_{ab}) = 0$; therefore, the limit
$\overline{f}(x) = \lim_{n\to+\infty} \frac{1}{n} S^f_n(x)$ exists $\mu$-a.e. on $\mathcal{X}$. Namely, outside the union of all $E_{ab}$ with rational
$a, b$, which is still a set of zero measure, the sequence $\frac{1}{n} S^f_n(x)$ converges pointwise to
$\overline{f}(x)$ which can however be $\pm\infty$.
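The limit function is $T$-invariant by construction; moreover, $\overline{f} \in L^1_\mu(\mathcal{X})$. Indeed,

$$\int_{\mathcal{X}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) \Bigr| \le \int_{\mathcal{X}} d\mu(x)\, |f(x)| ;$$

therefore, Fatou's lemma⁷ yields

$$\int_{\mathcal{X}} d\mu(x)\, |\overline{f}(x)| \le \liminf_{n\to+\infty} \int_{\mathcal{X}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) \Bigr| \le \int_{\mathcal{X}} d\mu(x)\, |f(x)| < +\infty .$$

Finally, choose $\lambda > 0$, let $g_\lambda(x) := |f(x)| - \lambda$, and consider the set $A^{g_\lambda}$ as in
Lemma 2.3.1; then,

$$\int_{\mathcal{X}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) - \overline{f}(x) \Bigr| \le \int_{\mathcal{X} \setminus A^{g_\lambda}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) - \overline{f}(x) \Bigr| + \int_{A^{g_\lambda}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) \Bigr| + \int_{A^{g_\lambda}} d\mu(x)\, |\overline{f}(x)| .$$


⁷ Fatou's lemma [305] asserts that if $f_n$ is a sequence of non-negative measurable functions on a measure space
$\mathcal{X}$, then $\int_{\mathcal{X}} d\mu\, \liminf_{n\to+\infty} f_n \le \liminf_{n\to+\infty} \int_{\mathcal{X}} d\mu\, f_n$.

Consider the third integral: Lemma 2.3.1 yields $\mu(A^{g_\lambda}) \le \frac{1}{\lambda} \|f\|_1$; as to the second
one, it can be estimated as follows

$$\int_{A^{g_\lambda}} d\mu(x)\, \Bigl| \frac{1}{n} S^f_n(x) \Bigr| \le \frac{1}{n} \sum_{k=0}^{n-1} \int_{|f(T^k x)| > \alpha} d\mu(x)\, |f(T^k x)| + \alpha\, \mu(A^{g_\lambda}) = \int_{|f(x)| > \alpha} d\mu(x)\, |f(x)| + \alpha\, \mu(A^{g_\lambda}),$$

for some fixed $\alpha > 0$. Now, $\mu(A^{g_\lambda})$ and thus the third integral can be made arbitrarily
small by choosing an appropriate $\lambda$, as well as the second one by setting $\alpha$
large enough. Further, the Lebesgue dominated convergence theorem⁸ can be applied to
$\int_{\mathcal{X} \setminus A^{g_\lambda}} d\mu(x)\, \bigl| \frac{1}{n} S^f_n(x) - \overline{f}(x) \bigr|$, which becomes negligibly small when $n \to +\infty$,
whence

$$\int_{\mathcal{X}} d\mu(x)\, \overline{f}(x) = \lim_{n\to+\infty} \int_{\mathcal{X}} d\mu(x)\, \frac{1}{n} S^f_n(x) = \lim_{n\to+\infty} \frac{1}{n} \sum_{k=0}^{n-1} \int_{\mathcal{X}} d\mu(x)\, f(T^k x) = \int_{\mathcal{X}} d\mu(x)\, f(x).$$
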

In the light of Birkhoff's ergodic theorem, we first give a general measure-theoretic
definition of ergodicity and then consider its physical consequences.


Definition 2.3.2 A dynamical system $(\mathcal{X}, T, \mu)$ is ergodic if for all measurable subsets
$B \subseteq \mathcal{X}$, $T^{-1}(B) = B$ implies $\mu(B) = 0$ or $\mu(B) = 1$.


Remarks 2.3.1 


1. The first conclusion to be drawn from this definition is that ergodic systems
cannot possess non-trivial $T$-invariant measurable functions (constants of the
motion). Indeed, if $f : \mathcal{X} \to \mathbb{R}$ is such that $f \circ T = f$ then $N_a := \{x \in \mathcal{X} :
f(x) = a\} \subseteq \mathcal{X}$ is measurable; moreover, as $x \in T^{-1}(N_a)$ implies $Tx \in N_a$, then
$f(x) = f(Tx) = a$. Thus, $T^{-1}(N_a) \subseteq N_a$ and ergodicity forces $N_a$ to equal
either $\mathcal{X}$ or $\emptyset$ $\mu$-a.e. for all $a \in \mathbb{R}$, whence $f(x) = c_f$ $\mu$-a.e. on $\mathcal{X}$.


⁸ The Lebesgue Dominated Convergence Theorem [305] asserts that if $\{f_n\}_{n\in\mathbb{N}}$ is a sequence of measurable
functions on $\mathcal{X}$ such that the limit $f(x) = \lim_{n\to+\infty} f_n(x)$ exists for all $x \in \mathcal{X}$ and $|f_n(x)| \le g(x)$
for all $x \in \mathcal{X}$ with $g \in L^1_\mu(\mathcal{X})$, then $f \in L^1_\mu(\mathcal{X})$ and $\int_{\mathcal{X}} d\mu(x)\, f(x) = \lim_{n\to+\infty} \int_{\mathcal{X}} d\mu(x)\, f_n(x)$.

2. If $f \in L^1_\mu(\mathcal{X})$, its time-average $\overline{f}$ is $T$-invariant by point 2 in Birkhoff's theorem.
If $(\mathcal{X}, T, \mu)$ is ergodic, from the previous remark and point 3 in Birkhoff's theorem,
$\overline{f}(x) = c_f$ $\mu$-a.e.; thus, $\mu(f) = c_f = \overline{f}(x)$ $\mu$-a.e. on $\mathcal{X}$. Namely, ergodicity
implies that time-averages and phase-averages (mean-values) of
(summable) observables coincide. Vice versa, dynamical systems where time-averages
and phase-averages coincide are ergodic because of Proposition 2.3.1
below.

3. If the only $T$-invariant measurable functions are constant almost everywhere on
$\mathcal{X}$, then $(\mathcal{X}, T, \mu)$ is ergodic: in fact, the characteristic functions of $T$-invariant
measurable subsets are $T$-invariant and must then be constant almost everywhere,
namely equal either to 0 or to 1 $\mu$-a.e.

4. The average time spent within $B$ by almost all phase-points of an ergodic system
equals the volume of $B$. Indeed, let $\tau^t_B(x) := \frac{1}{t} \sum_{s=0}^{t-1} \mathbb{1}_B(T^s x)$ count the mean
number of times $B$ is crossed by the trajectory $\{T^n x\}_{n\in\mathbb{N}}$ during a span of time
of length $t$. Then, ergodicity yields

$$\tau_B(x) := \lim_{t\to+\infty} \tau^t_B(x) = \mu(B) \qquad \mu\text{-a.e.} \tag{2.60}$$
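
The statement that the mean sojourn time in $B$ equals its volume can be illustrated with a rotation of the circle. The sketch below assumes the unit circle $[0,1)$ with the Lebesgue measure, the rotation by the golden-ratio conjugate, which is irrational, and the hypothetical interval $B = [0.2, 0.5)$ of measure $0.3$.

```python
import math

def sojourn_fraction(x0, alpha, lo, hi, t):
    """(1/t) sum_{s<t} 1_B(T^s x0) for the rotation T x = x + alpha mod 1, B = [lo, hi)."""
    visits = 0
    for s in range(t):
        x = (x0 + s * alpha) % 1.0
        if lo <= x < hi:
            visits += 1
    return visits / t

alpha = (math.sqrt(5) - 1) / 2           # irrational rotation angle
frac = sojourn_fraction(0.0, alpha, 0.2, 0.5, 50000)
print(frac)                              # close to mu(B) = 0.3
```

For a rational angle the orbit is periodic and the visit frequency depends on the initial point, so it need not equal $\mu(B)$.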


Proposition 2.3.1 A dynamical system $(\mathcal{X}, T, \mu)$ is ergodic if and only if for all
measurable $A, B$ it holds that

$$\lim_{t\to+\infty} \frac{1}{t} \sum_{s=0}^{t-1} \mu\bigl(A \cap T^{-s}(B)\bigr) = \mu(A)\,\mu(B). \tag{2.61}$$

Proof Consider $\mathbb{1}_A(x)\, \tau^t_B(x) = \frac{1}{t} \sum_{s=0}^{t-1} \mathbb{1}_{A \cap T^{-s}(B)}(x)$; by Birkhoff's theorem and
the Lebesgue dominated convergence theorem (see Footnote 8) it follows that

$$\mu(\mathbb{1}_A\, \tau_B) = \lim_{t\to+\infty} \frac{1}{t} \sum_{s=0}^{t-1} \mu\bigl(A \cap T^{-s}(B)\bigr).$$

If the system is ergodic, then (2.61) follows from (2.60). If (2.61) holds, then $A =
B = T^{-1}(B) \Rightarrow \mu(B) = \mu(B)^2$, whence $\mu(B)$ equals either 0 or 1.


Definition 2.3.3 (Mixing) A dynamical system $(\mathcal{X}, T, \mu)$ is mixing if and only if
for all measurable subsets $A, B \subseteq \mathcal{X}$ it holds that

$$\lim_{t\to+\infty} \mu\bigl(A \cap T^{-t}(B)\bigr) = \mu(A)\,\mu(B). \tag{2.62}$$

The subsets $A \cap T^{-t}(B)$ consist of those points of $A$ that visit $B$ at time $t$; thus,
(2.62) asserts that relative to the volume of any measurable subset $A$, the volume of
points of $A$ that will eventually be in another measurable subset $B$ equals the volume
of $B$. In other words, mixing dynamical systems are in the long run characterized by
the uniform spreading of their measurable subsets; on the other hand (2.61) states
that ergodicity amounts to a uniform spreading on average.

From a physical point of view, quantities like $\mu(A \cap T^{-t}(B))$ are two-point
correlation functions; thus, mixing characterizes dynamical systems whose two-point
correlation functions factorize asymptotically, whereas ergodicity corresponds
to two-point correlation functions factorizing in the mean.


Remarks 2.3.2

1. If $\lim_{n\to+\infty} a_n = a$, $a_n \in \mathbb{R}$, then $\lim_{n\to+\infty} \frac{1}{n} \sum_{j=0}^{n-1} a_j = a$, whence (2.62) implies (2.61)
and mixing implies ergodicity. The opposite is not in general true as the time-average
can get rid of those $s$ for which $\mu(A \cap T^{-s}(B)) \ne \mu(A)\mu(B)$. There is
a third asymptotic behavior, intermediate between ergodicity and mixing, known
as weak mixing [370] and related to the fact that

$$\lim_{n\to\infty} |a_n - a| = 0 \;\Rightarrow\; \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} |a_j - a| = 0 \;\Rightarrow\; \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} (a_j - a) = 0.$$

Weak mixing amounts to the request that

$$\lim_{t\to\infty} \frac{1}{t} \sum_{s=0}^{t-1} \bigl| \mu\bigl(A \cap T^{-s}(B)\bigr) - \mu(A)\mu(B) \bigr| = 0; \tag{2.63}$$

it is implied by mixing and implies ergodicity.

2. Given an invertible map $T$, a stronger notion of mixing is formulated as follows
[111]. Given any finite collection $\mathcal{S}_r := \{S_j\}_{j=1}^{r}$, $S_j \in \Sigma$, of measurable subsets
of $\mathcal{X}$, denote by

$$\Sigma_n(\mathcal{S}_r) := \bigvee_{k \ge n} T^{-k}(\mathcal{S}_r)$$

the $\sigma$-algebra generated by all possible atoms of the form $T^{-k}(S_j)$ for $k \ge n$ and
$S_j \in \mathcal{S}_r$. Then, $(\mathcal{X}, T, \mu)$ is said to be K-mixing if

$$\lim_{n\to\infty} \sup_{B \in \Sigma_n(\mathcal{S}_r)} \bigl| \mu(S_0 \cap B) - \mu(S_0)\, \mu(B) \bigr| = 0, \tag{2.64}$$

for all $S_0, S_j \in \Sigma$. Observe that $x$ belonging to an atom $T^{-k}(S_j)$ of $\Sigma_n(\mathcal{S}_r)$ implies
$T^k x \in S_j$ at some time $k \ge n$ for some $S_j \in \mathcal{S}_r$; therefore, K-mixing amounts to the
uniform statistical independence of any given measurable subset from the trajectories of any finite
family of subsets if these are considered sufficiently far away in the past.


By using the density of the algebra of simple functions $\mathcal{S}(\mathcal{X})$ in the Hilbert
space $L^2_\mu(\mathcal{X})$ as in Example 2.1.1, it is convenient to reformulate (2.61) and (2.62) in
terms of square-summable functions. It is thus possible to study how those properties
constrain the spectrum of the Koopman operator $U_T$.


Proposition 2.3.2 A dynamical system $(\mathcal{X}, T, \mu)$ is

1. ergodic if and only if for all $\psi, \phi \in L^2_\mu(\mathcal{X})$

$$\lim_{t\to+\infty} \frac{1}{t} \sum_{s=0}^{t-1} \mu(\psi\, \phi \circ T^s) = \mu(\psi)\,\mu(\phi); \tag{2.65}$$

2. mixing if and only if for all $\psi, \phi \in L^2_\mu(\mathcal{X})$

$$\lim_{t\to+\infty} \mu(\psi\, \phi \circ T^t) = \mu(\psi)\,\mu(\phi). \tag{2.66}$$

According to the Koopman-von Neumann formalism of Example 2.1.1, using (2.5),
it turns out that $\mu(\psi\, \phi \circ T^t) = \langle \psi^* | U_T^t | \phi \rangle$. Therefore, ergodicity and mixing can
conveniently be expressed as weak-limits, that is as limits with respect to the weak
topology (see Remark 2.2.3.3). Then (2.65) and (2.66) are equivalent to

$$w\text{-}\lim_{t\to+\infty} \frac{1}{t} \sum_{s=0}^{t-1} U_T^s = |\mathbb{1}\rangle\langle\mathbb{1}| \tag{2.67}$$

$$w\text{-}\lim_{t\to+\infty} U_T^t = |\mathbb{1}\rangle\langle\mathbb{1}| . \tag{2.68}$$

The constant function $|\mathbb{1}\rangle$ is such that $U_T|\mathbb{1}\rangle = |\mathbb{1}\rangle$; if there exists $|\psi\rangle \ne |\mathbb{1}\rangle$
such that $U_T|\psi\rangle = |\psi\rangle$, then one can orthogonally decompose $|\psi\rangle = \alpha|\mathbb{1}\rangle +
\beta|\phi\rangle$ with $\langle \phi | \mathbb{1} \rangle = 0$, $\|\phi\| = 1$ and $U_T|\phi\rangle = |\phi\rangle$. Thus,

$$1 = \lim_{t\to+\infty} \langle \phi | U_T^t | \phi \rangle \ne |\langle \phi | \mathbb{1} \rangle|^2 = 0,$$

whence (2.68) cannot hold. If (2.68) holds, a similar argument excludes the presence
of eigenvectors $|\psi\rangle$ such that $U_T|\psi\rangle = e^{i\lambda}|\psi\rangle$ with $e^{i\lambda} \ne 1$. Therefore,


Proposition 2.3.3 A dynamical system $(\mathcal{X}, T, \mu)$ is mixing only if 1 is the only
eigenvalue of its Koopman operator and it is not degenerate.

In order to see the impact of ergodicity as expressed by (2.67) on the spectrum of
$U_T$, we use [370]


Proposition 2.3.4 (von Neumann Ergodic Theorem) Let Ur be the unitary Koop- 
man operator acting on the Hilbert space H := L? (X) of a dynamical triplet 
=} 
1 
(X, T, u), withT invertible. Let A; : H t+ H be defined by A;| Y ) := 7 > Ur|), 
s=0 
w c H, and let P project onto the subspace K of vectors such that Ur| 4%) = 
|Y). Then, me |(A; — P)w|| = 0; in other words, P is the strong limit (see 
—> +00 


Remark 2.2.3.1) of the sequence of operators A;, P = s — lim;-++00 At. 


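Proof The subspace orthogonal to $\mathbb{K}$ is $(U_T - \mathbb{1})\mathbb{H}$; thus, for any $\psi \in \mathbb{H}$,

$$|\psi\rangle = P|\psi\rangle + (\mathbb{1} - P)|\psi\rangle = P|\psi\rangle + (U_T - \mathbb{1})|\phi\rangle\ ,$$

for some $\phi \in \mathbb{H}$. Since $A_t P = P$ and $A_t(U_T - \mathbb{1}) = \dfrac{U_T^t - \mathbb{1}}{t}$, the result follows from

$$\|(A_t - P)|\psi\rangle\| = \frac{\|(U_T^t - \mathbb{1})|\phi\rangle\|}{t} \le \frac{2\|\phi\|}{t}\ .$$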


Corollary 2.3.1 A dynamical system $(\mathcal{X}, T, \mu)$ is ergodic if and only if 1 is a
non-degenerate eigenvalue of the Koopman operator $U_T$.


Proof Since strong convergence implies weak convergence, condition (2.67) means
that ergodicity is equivalent to $P = |\mathbb{1}\rangle\langle\mathbb{1}|$.
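
The convergence stated in Proposition 2.3.4 can be checked numerically in a finite-dimensional toy setting. The sketch below is only an illustration and is not taken from the text: it assumes a diagonal unitary $U = \mathrm{diag}(e^{i\theta_k})$ acting on $\mathbb{C}^3$ (the names `phases`, `cesaro_average` and `project_invariant` are hypothetical), for which the time averages $A_t$ and the projector $P$ onto the eigenvalue-1 subspace can be written explicitly.

```python
import cmath
import math

def cesaro_average(phases, psi, t):
    """Apply A_t = (1/t) * sum_{s=0}^{t-1} U^s to psi, with U = diag(exp(i*phase))."""
    out = []
    for theta, c in zip(phases, psi):
        total = sum(cmath.exp(1j * theta * s) for s in range(t))
        out.append(c * total / t)
    return out

def project_invariant(phases, psi, tol=1e-12):
    """Projector P onto the eigenvalue-1 subspace of the diagonal unitary."""
    return [c if abs(cmath.exp(1j * theta) - 1) < tol else 0j
            for theta, c in zip(phases, psi)]

def distance(u, v):
    """Euclidean norm of u - v."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(u, v)))

# Assumed toy data: eigenvalues 1, e^{i}, e^{2.5 i}; only the first is invariant.
phases = [0.0, 1.0, 2.5]
psi = [0.6, 0.8j, 0.3]

for t in (10, 100, 1000):
    err = distance(cesaro_average(phases, psi, t), project_invariant(phases, psi))
    print(t, err)  # decays like 1/t, in agreement with the bound 2*||phi||/t
```

In this toy case the non-invariant components of $A_t\psi$ are geometric sums, so the $1/t$ decay used in the proof is visible directly.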


Remarks 2.3.3 


1. By substituting $\psi, \phi \in L^2_\mu(\mathcal{X})$ with $\psi - \mu(\psi)$ and $\phi - \mu(\phi)$ (they also belong to
$L^2_\mu(\mathcal{X})$), ergodicity, respectively mixing, amount to

$$\lim_{t\to\infty} \frac{1}{t}\sum_{s=0}^{t-1} \mu(\psi\,\phi\circ T^s) = 0\ , \qquad \lim_{t\to\infty} \mu(\psi\,\phi\circ T^t) = 0\ , \tag{2.69}$$

for all $\psi, \phi \in L^2_\mu(\mathcal{X})$ with $\mu(\psi) = \mu(\phi) = 0$.

2. In case $T$ is not invertible, the Koopman operator is not unitary, but just an
isometry, that is $U^\dagger U = \mathbb{1}$, while $UU^\dagger \neq \mathbb{1}$. If $T$ is invertible, time averages can
be extended from $-\infty$ to $+\infty$ and, because of $T$-invariance of $\mu$, it does not
matter which one of $\psi$ and $\phi$ is the time-evolving function in (2.65) and (2.66).


Examples 2.3.2


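1. Ergodic rotations as in Example 2.1.2 are never mixing, for the Koopman operator
has the exponential functions $e_n(\theta)$ as eigenfunctions.

2. The system in Example 2.1.3 is mixing. Given $\psi, \phi \in L^2_\mu(\mathbb{T}^2)$ with $\mu(\psi) = \mu(\phi) = 0$,
let $\varepsilon > 0$ and choose $L > 0$ such that $\|\psi - \psi_\varepsilon\| \le \varepsilon$ and $\|\phi - \phi_\varepsilon\| \le \varepsilon$,
where $\psi_\varepsilon = \sum_{\|n\|\le L} \hat\psi(n)\, e_n$ and $\phi_\varepsilon = \sum_{\|n\|\le L} \hat\phi(n)\, e_n$. Then, using (2.22),

$$\begin{aligned}
|\mu(\psi\,\phi\circ T^p)| &\le \varepsilon\,(\|\phi_\varepsilon\| + \|\psi\|) + |\mu(\psi_\varepsilon\,\phi_\varepsilon\circ T^p)| \\
&\le \varepsilon\,(\|\phi_\varepsilon\| + \|\psi\|) + \sum_{\substack{\|n\|\le L \\ \|(A^T)^p n\|\le L}} \big|\hat\psi(-(A^T)^p n)\,\hat\phi(n)\big|\ .
\end{aligned}$$

When $p \to \infty$, hyperbolicity permits $\|n\| \le L$ and $\|(A^T)^p n\| \le L$ only if $n = 0$;
since $\hat\phi(0) = \mu(\phi) = 0$, the Arnold Cat Map satisfies (2.66).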

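3. Conditions for the Markov chain in Example 2.1.5 to be mixing can be derived
by considering two-point correlation functions involving cylinders, of the form

$$\mu\Big(C^{[0,p-1]}_{i_0\cdots i_{p-1}} \cap T_\sigma^{-t}\big(C^{[0,q-1]}_{j_0\cdots j_{q-1}}\big)\Big) = \mu\Big(C^{[0,p-1]}_{i_0\cdots i_{p-1}} \cap C^{[t,t+q-1]}_{j_0\cdots j_{q-1}}\Big)\ .$$

By means of the matrix $P = [p(i|j)]$ of transition probabilities and of (2.36),
one writes

$$\begin{aligned}
\mu\Big(C^{[0,p-1]}_{i_0\cdots i_{p-1}} \cap C^{[t,t+q-1]}_{j_0\cdots j_{q-1}}\Big)
&= \sum_{k_p,\ldots,k_{t-1}} p(i_0\cdots i_{p-1}\,k_p\cdots k_{t-1}\,j_0\cdots j_{q-1}) \\
&= \sum_{k_p,k_{t-1}=1}^{d} \Big(\prod_{a=0}^{q-2} p_{j_{a+1}j_a}\Big)\, p_{j_0 k_{t-1}} \Big(\sum_{k_{p+1},\ldots,k_{t-2}}\ \prod_{b=p}^{t-2} p_{k_{b+1}k_b}\Big) \\
&\qquad\times\ p_{k_p i_{p-1}} \Big(\prod_{c=0}^{p-2} p_{i_{c+1}i_c}\Big)\, p(i_0) \\
&= \sum_{k_p,k_{t-1}=1}^{d} \Big(\prod_{a=0}^{q-2} p_{j_{a+1}j_a}\Big)\, p_{j_0 k_{t-1}}\, (P^{t-p-1})_{k_{t-1}k_p}\, p_{k_p i_{p-1}}\ \underbrace{p(i_0\cdots i_{p-1})}_{\mu(C^{[0,p-1]}_{i_0\cdots i_{p-1}})}\ .
\end{aligned}$$

Using (2.34) and (2.35), it follows that (see [77,370])

$$\lim_{t\to+\infty} \mu\Big(C^{[0,p-1]}_{i_0\cdots i_{p-1}} \cap C^{[t,t+q-1]}_{j_0\cdots j_{q-1}}\Big) = \mu\Big(C^{[0,p-1]}_{i_0\cdots i_{p-1}}\Big)\, \mu\Big(C^{[0,q-1]}_{j_0\cdots j_{q-1}}\Big)$$

is achieved if and only if $\lim_{t\to+\infty} (P^t)_{ij} = p(i)$ for all $j = 1, 2, \ldots, d$; while
factorization in the mean (and thus ergodicity) holds if and only if, for all
$j = 1, 2, \ldots, d$,

$$Q_{ij} := \lim_{t\to+\infty} \frac{1}{t}\sum_{s=0}^{t-1} (P^s)_{ij} = p(i)\ .$$

4. The condition for mixing in the previous example is certainly satisfied by Bernoulli
dynamical systems whose matrix of transition probabilities is $P = [p(i)]_{i,j=1}^{d}$
(see Example 2.1.5).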
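
The two criteria of Example 2.3.2.3 can be compared on small transition matrices. The following is a minimal sketch, not part of the original text: the two $2\times 2$ column-stochastic matrices `P_mixing` and `P_flip` are assumed toy data, with entry `[i][j]` standing for $p(i|j)$. For the first, $(P^t)_{ij}$ converges to the stationary probability $p(i)$; for the second, which has period 2, the powers oscillate and only their time averages converge.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, t):
    """t-th power P^t by repeated multiplication."""
    R = identity(len(P))
    for _ in range(t):
        R = mat_mul(R, P)
    return R

def cesaro_mean(P, t):
    """Time average (1/t) * sum_{s=0}^{t-1} P^s."""
    n = len(P)
    acc = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for _ in range(t):
        acc = [[acc[i][j] + power[i][j] / t for j in range(n)] for i in range(n)]
        power = mat_mul(power, P)
    return acc

# Assumed toy chains; columns sum to 1 because entry [i][j] is p(i|j).
P_mixing = [[0.9, 0.2],
            [0.1, 0.8]]   # stationary vector (2/3, 1/3)
P_flip = [[0.0, 1.0],
          [1.0, 0.0]]     # period 2, stationary vector (1/2, 1/2)

print(mat_pow(P_mixing, 50))                      # every column close to (2/3, 1/3)
print(mat_pow(P_flip, 50), mat_pow(P_flip, 51))   # the powers keep alternating
print(cesaro_mean(P_flip, 1000))                  # time averages close to 1/2
```

The second chain is thus ergodic but not mixing, in the same way as the rotations of Example 2.3.2.1.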


2.3.1 K-systems 


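Consider an invertible Bernoulli system $(\Omega_p^{\mathbb{Z}}, T_\sigma, \mu)$, where the space $\Omega_p^{\mathbb{Z}}$ of $p$-adic
doubly infinite sequences $\boldsymbol{i}$ is equipped with the $\sigma$-algebra $\Sigma$ generated by cylinder
sets and $\mu$ is a translation invariant product measure on $\Sigma$. Let $\mathcal{C}_{[0]} = \{C^{[0]}_i\}_{i=1}^{p}$
be the finite partition consisting of simple cylinders as in (2.25) and consider the
$\sigma$-algebras

$$\mathcal{C}_{0]} := \bigvee_{j\ge 0} T_\sigma^{-j}\big(\mathcal{C}_{[0]}\big) = \bigvee_{j\ge 0} \mathcal{C}_{[j]} \tag{2.70}$$

$$\mathcal{C}_{n]} := T_\sigma^{n}\big(\mathcal{C}_{0]}\big) = \bigvee_{j\ge -n} \mathcal{C}_{[j]} \tag{2.71}$$

generated by union and intersections of cylinders of the form (see (2.26))

$$C^{[p,q]}_{\boldsymbol{i}^{(q-p+1)}} = \bigcap_{j=p}^{q} T_\sigma^{-j}\big(C^{[0]}_{i_j}\big)\ , \qquad \boldsymbol{i}^{(q-p+1)} = i_p i_{p+1}\cdots i_q \in \Omega_p^{(q-p+1)}\ ,$$

for any $q \ge p \ge -n$. From Sect. 2.1.2 and Examples 2.3.2.3-2.3.2.4, one deduces
that:

$$\text{i)}\ \ \mathcal{C}_{n]} \subset \mathcal{C}_{n+1]}\ , \qquad \text{ii)}\ \bigvee_{n\ge 0} \mathcal{C}_{n]} = \Sigma\ , \qquad \text{iii)}\ \bigwedge_{n\ge 0} \mathcal{C}_{-n]} = \mathcal{N}\ , \tag{2.72}$$

where $\mathcal{N}$ is the trivial $\sigma$-algebra consisting only of the empty set $\emptyset$ and of $\Omega_p^{\mathbb{Z}}$, all
equalities being understood up to sets of zero measure. Condition ii) expresses the
fact that cylinder sets $C^{[p,q]}$ with $p, q \in \mathbb{Z}$ generate $\Sigma$, while in condition iii),

$$\bigwedge_{n\ge 0} \mathcal{C}_{-n]} = \bigwedge_{n\ge 0}\ \bigvee_{j\ge n} T_\sigma^{-j}\big(\mathcal{C}_{[0]}\big)$$

denotes the largest $\sigma$-algebra, called tail of $\mathcal{C}_{[0]}$ (Tail $(\mathcal{C}_{[0]})$), contained in all $\mathcal{C}_{-n]}$
with $n \ge 0$.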
Cylinders in $\mathcal{C}_{-n]}$ are of the form $C^{[p,p+q]}$ with $p \ge n$, $q \ge 0$; they become subsets
of Tail $(\mathcal{C}_{[0]})$ when $n \to +\infty$. Then, from the mixing relation in Remark 2.3.2.3,
one deduces that the characteristic functions of these atoms go into the constant
functions $\mu(C^{[p,p+q]})\,\mathbb{1}$, asymptotically, whence condition (iii). Bernoulli shifts are
particular instances of Kolmogorov (K-)systems [111] and $\mathcal{C}_{0]}$ a particular example
of K-partition.


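Definition 2.3.4 (Classical K-systems) A discrete-time dynamical system $(\mathcal{X}, T, \mu)$
with $\sigma$-algebra $\Sigma$ is a K-system if there exists a $\sigma$-subalgebra (a so-called
K-partition) $\Sigma_0 \subset \Sigma$ that gives rise to a nested K-sequence of $\sigma$-subalgebras
$\Sigma_t := T^t(\Sigma_0)$ such that

1. $\Sigma_t := T^t(\Sigma_0) \subset \Sigma_{t+1}$ for all $t \in \mathbb{Z}$;
2. $\bigvee_{t\in\mathbb{Z}} \Sigma_t = \Sigma$;
3. $\bigwedge_{t\in\mathbb{Z}} \Sigma_t = \mathcal{N}$.

For Bernoulli shifts, the partition $\mathcal{C}_{[0]}$ is such that $\bigvee_{t\in\mathbb{Z}} T_\sigma^t(\mathcal{C}_{[0]}) = \Sigma$ and
Tail $(\mathcal{C}_{[0]}) = \mathcal{N}$: $\mathcal{C}_{[0]}$ is a generating partition with a trivial tail.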


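Definition 2.3.5 Let $(\mathcal{X}, T, \mu)$ be a measure-theoretic dynamical triplet with $\Sigma$ as
$\sigma$-algebra.

1. A finite, measurable partition $\mathcal{P}$ of $\mathcal{X}$ is called generating if (apart from sets of
zero measure $\mu$)

$$\bigvee_{j=-\infty}^{+\infty} T^{j}(\mathcal{P}) = \Sigma \ \ (T \text{ invertible}) \qquad \text{or} \qquad \bigvee_{j=0}^{+\infty} T^{-j}(\mathcal{P}) = \Sigma \ \ (\text{otherwise}).$$

2. The tail of a finite measurable partition $\mathcal{P}$ is defined by

$$\mathrm{Tail}\,(\mathcal{P}) := \bigwedge_{n\ge 0}\ \bigvee_{k\ge n} T^{-k}(\mathcal{P}) \tag{2.73}$$

and will be said to be trivial if Tail $(\mathcal{P}) = \mathcal{N}$, that is if all its subsets equal $\emptyset$ or $\mathcal{X}$
up to sets of zero measure $\mu$.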


Remarks 2.3.4 A generating partition $\mathcal{P} = \{P_i\}_{i\in I}$ consists of $\Sigma$-measurable atoms
$P_i$ such that unions and intersections of their images $T^n(P_i)$, in the past, $n < 0$, and
in the future, $n \ge 0$, generate $\Sigma$. Instead, the refinements $\mathcal{P}_{-n]} := \bigvee_{j\ge n} T^{-j}(\mathcal{P})$
are the $\sigma$-subalgebras generated by the atoms in the past of $\mathcal{P}$ up to a discrete time
$t = -n$. Since $\mathcal{P}_{-n-1]} \subseteq \mathcal{P}_{-n]}$, one can also loosely write Tail $(\mathcal{P}) = \lim_{n\to+\infty} \mathcal{P}_{-n]}$
to indicate that the tail of $\mathcal{P}$ contains all measurable subsets generated by the remote
past of $\mathcal{P}$. As such, tails are $T$-invariant.

From the preceding discussion concerning Bernoulli shifts, a relation clearly emerges
between the triviality of the tails of partitions and the mixing properties of the
dynamical system.


Proposition 2.3.5 A dynamical system $(\mathcal{X}, T, \mu)$ is K-mixing (see Remark 2.3.2.2)
if and only if all its finite partitions have trivial tails.


Proof Consider a finite partition $\mathcal{P}$ and its tail. By definition, Tail $(\mathcal{P})$ is mapped into
itself by $T$; thus, if in (2.64) $S_0 \in$ Tail $(\mathcal{P})$, then $S_0$ belongs to $\mathcal{P}_{-n]} := \bigvee_{j\ge n} T^{-j}(\mathcal{P})$
for all $n$ and thus to the $\sigma$-algebra $\Sigma_n(\mathcal{P})$ generated by $\mathcal{P}_{-n]}$. Therefore, one
can choose $B = S_0$ in (2.64) which then yields $\mu(S_0 \cap S_0) = \mu(S_0)^2$ and, in turn,
Tail $(\mathcal{P}) = \mathcal{N}$.

Vice versa, let us choose as S, in (2.64) a finite partition P? and consider the 
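$\sigma$-subalgebras $\mathcal{P}_{-n]} \subset \Sigma$ generated by the infinite refinements $\bigvee_{k\ge n} T^{-k}(\mathcal{P})$. The
corresponding conditional probabilities $\mu(S|\mathcal{P}_{-n]})(x)$, $S \in \Sigma$ (see (2.49) and (2.50))
are such that, for any $A_0 \in \Sigma$ and $B \in \mathcal{P}_{-n]}$,

$$\big|\mu(A_0 \cap B) - \mu(A_0)\,\mu(B)\big| \le \int_{\mathcal{X}} d\mu(x)\, \big|\mu(A_0|\mathcal{P}_{-n]})(x) - \mu(A_0)\big|\ .$$

Because of Theorem 2.2.1 and Examples 2.2.4.1,3, from $\mathcal{P}_{-n]} \downarrow \mathcal{N}$ it follows that
$\mu(A_0|\mathcal{P}_{-n]})(x) \to \mu(A_0)$ $\mu$-a.e. when $n \to \infty$, whence K-mixing follows from
the Lebesgue dominated convergence theorem.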


In the next chapter, by using entropic tools, we shall show that all finite partitions 
of K-systems have trivial tails and are thus K-mixing; at this point it suffices to 
observe that 


Proposition 2.3.6 If a dynamical triplet $(\mathcal{X}, T, \mu)$ has a generating partition $\mathcal{P}$ with
trivial tail, Tail $(\mathcal{P}) = \mathcal{N}$, then it is a K-system.


Proof The $\sigma$-algebra $\mathcal{P}_{0]} := \bigvee_{n\ge 0} T^{-n}(\mathcal{P})$ is a K-partition.


Examples 2.3.3


1. As for Bernoulli shifts, also for the Markov shifts in Example 2.1.5 the partition
$\mathcal{P} = \{P^{[0]}_i\}$ consisting of simple cylinders as in (2.25) satisfies conditions i) and
ii) in (2.72). However, the argument used when discussing the triviality of the
tail for Bernoulli shifts shows that Tail $(\mathcal{P}) = \mathcal{N}$ if and only if the Markov shifts
are mixing, in which case by Proposition 2.3.6 they are K-systems.


? Starting from the finite set S, of measurable subsets { Si}; one constructs the partition of X 
consisting of Sọ := (\j=; Si. S; = Si\Sp and Sla = X\ Uj S}. 
2. Consider Example 2.1.2 with the frequencies $\omega_{1,2}$ such that the system is ergodic
(see Remarks 2.1.2.2 and 2.1.2.3). As a partition of $\mathbb{T}^2$, choose the Cartesian
product $\mathcal{C}$ of the partitions of the 1-dimensional torus $\mathbb{T}$ into atoms $C_1 = \{0 \le \theta < \pi\}$,
$C_2 = \{\pi \le \theta < 2\pi\}$. Because of ergodicity, the trajectories of the end points
of the atoms $C_i \times C_j$ fill $\mathbb{T}^2$ densely and the intersections of their images $T^k(C_i \times C_j)$
under the dynamics become finer and finer and approximate better and better
the Borel $\sigma$-algebra of $\mathbb{T}^2$. Actually, this already occurs if one restricts to $T^{-j}(\mathcal{C})$
with $j \ge 0$, namely $\bigvee_{j\ge 0} T^{-j}(\mathcal{C}) = \Sigma$. This also means that Tail $(\mathcal{C}) = \Sigma$.

3. The partition in Example 2.2.2 of the two-torus into the vertical half-rectangles 
gives thinner and thinner vertical rectangles while moving into the past, and 
thinner and thinner horizontal rectangles into the future. Their intersections are 
squares of increasingly small side, by means of which one can approximate better 
and better every Borel subset of $\mathbb{T}^2$. The tail of such a partition is trivial due to
the fact that the Baker map acts as a Bernoulli shift with respect to it. 


In order to set the ground for a quantum extension of the notion of K-system
(see Sect. 7.4.4), we reformulate the conditions in Definition 2.3.4
in terms of algebras of functions. Given a K-sequence $\{\Sigma_t := T^t(\Sigma_0)\}_{t\in\mathbb{Z}}$ of
$\sigma$-subalgebras, consider the Abelian von Neumann subalgebras $\mathcal{M}_t := L^\infty_\mu(\mathcal{X}, \Sigma_t) = \Theta_T^{-t}[\mathcal{M}_0]$
consisting of the essentially bounded $\Sigma_t$-measurable functions on $\mathcal{X}$ (see
Sect. 2.2.2). Then, one has

$$\mathcal{M}_t \subset \mathcal{M}_{t+1}\ , \qquad \bigvee_{t\in\mathbb{Z}} \mathcal{M}_t = \mathcal{M}\ , \qquad \bigwedge_{t\in\mathbb{Z}} \mathcal{M}_t = \{\lambda\mathbb{1}\}\ ,$$

where the generation of $\mathcal{M}$ by $\bigvee$ is by strong-operator closure on the Hilbert space
$L^2_\mu(\mathcal{X})$, while $\bigwedge$ denotes set-theoretic intersection.

We shall see in Sect. 5.3.2 that unital Abelian von Neumann algebras $\mathcal{M}$ can
always be identified with suitable $L^\infty_\mu(\mathcal{X})$ and represented as multiplication operators
on the Hilbert space $L^2_\mu(\mathcal{X})$. It thus makes sense to provide an algebraic reformulation
of Definition 2.3.4 (see Definition 2.2.4).


Definition 2.3.6 (Classical Algebraic K-Systems) A classical von Neumann algebraic
triplet $(\mathcal{M}, \Theta_T, \omega)$ is an algebraic K-system, if there exists a von Neumann
subalgebra $\mathcal{N}_0 \subset \mathcal{M}$ such that, setting $\mathcal{N}_t := \Theta_T^t(\mathcal{N}_0)$, $t \in \mathbb{Z}$,

1. $\mathcal{N}_t \subset \mathcal{N}_{t+1}$ for all $t \in \mathbb{Z}$;
2. $\bigvee_{t\in\mathbb{Z}} \mathcal{N}_t = \mathcal{M}$;
3. $\bigwedge_{t\in\mathbb{Z}} \mathcal{N}_t = \{\lambda\mathbb{1}\}$.

Any such sequence $\{\mathcal{N}_t\}_{t\in\mathbb{Z}}$ of von Neumann subalgebras of $\mathcal{M}$ will be called a
classical K-sequence.

Remarks 2.3.5 The above definition can also be formulated in an Abelian $C^*$ algebraic
context; there, $\mathcal{M}$ will be a $C^*$ algebra as well as the subalgebras of the
K-sequence and $\bigvee_{t\in\mathbb{Z}} \mathcal{M}_t$ will denote the algebra generated by norm closure. The
classical spin chains discussed in Sect. 2.2.3 are instances of classical $C^*$ K-systems:
with $\mathcal{M} = D_{\mathbb{Z}}$, $\mathcal{N}_0$ will be the left half-spin chain $D_{0]}$ generated by the diagonal
matrix algebras $D_{[p,q]}$ with $p \le q \le 0$. Then, the algebraic K-sequence will consist
of the subalgebras $D_{t]} = \Theta_\sigma^t(D_{0]})$ generated by the diagonal matrix algebras $D_{[p,q]}$
with $p \le q \le t$.


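Given a measure-theoretic K-system with a K-sequence of $\sigma$-algebras $\{\Sigma_n\}_{n\in\mathbb{Z}}$,
instead of considering the von Neumann algebras $\mathcal{M}_n$, one may focus upon the
Hilbert spaces $\mathbb{H}_n := L^2_\mu(\mathcal{X}, \Sigma_n)$ of square-summable $\Sigma_n$-measurable functions on
$\mathcal{X}$. From the conditions (i), (ii) and (iii) in Definition 2.3.4, it follows that

1. $\mathbb{H}_t \subset \mathbb{H}_{t+1}$ for all $t \in \mathbb{Z}$;
2. $\bigvee_{t\in\mathbb{Z}} \mathbb{H}_t = \mathbb{H}$;
3. $\bigwedge_{t\in\mathbb{Z}} \mathbb{H}_t = \mathbb{C}\,\mathbb{1}$,

where $\mathbb{H} := L^2_\mu(\mathcal{X})$ and $\mathbb{C}\,\mathbb{1}$ stands for the Hilbert space consisting of constant functions
on $\mathcal{X}$ ($\mu$-a.e.). Since, according to the construction of the unitary Koopman operator
in (2.4), $\mathbb{1}_{T(S_0)}(x) = \mathbb{1}_{S_0}(T^{-1}x) = (U_T^{-1}\mathbb{1}_{S_0})(x)$, it follows that $\mathbb{H}_t = U_T^{-t}\mathbb{H}_0$;
whence, setting $\mathbb{K}_t := \mathbb{H}_{t+1} \ominus \mathbb{H}_t$,

$$t \neq s \implies \mathbb{K}_t \perp \mathbb{K}_s\ , \qquad \mathbb{H} = \bigoplus_{t\in\mathbb{Z}} \mathbb{K}_t\ . \tag{2.74}$$

By choosing an orthonormal basis $\{|f_j\rangle\}_{j\in J}$ in $\mathbb{K}_0$, one gets an orthonormal
basis for $\mathbb{K}_t$ of the form $\{|e_{j,t}\rangle := U_T^{-t}|f_j\rangle\}_{j\in J}$ and thus one for $\mathbb{H}$ of the form
$\{|e_{j,t}\rangle\}_{j\in J,\, t\in\mathbb{Z}}$. Any unitary operator $U$ on a separable Hilbert space $\mathbb{H}$ which generates
an orthonormal basis of the previous form is said to have a Lebesgue spectrum
of multiplicity $J$.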


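Proposition 2.3.7 ([111]) For a K-system $(\mathcal{X}, T, \mu)$, $J$ is countably infinite.


Proof Since $\mathbb{H}_0 \subset \mathbb{H}_1$ there surely exists $f \in \mathbb{H}_1$ with $g := f - E(f|\Sigma_0)$, where
$E(f|\Sigma_0)$ is the conditional expectation of $f$ with respect to $\Sigma_0$, such that $E(|g|^2|\Sigma_0) \neq 0$
on a $\Sigma_0$-measurable subset $S_0$ with $\mu(S_0) > 0$. Consider the function $G \in \mathbb{H}_1$
defined by

$$G(x) := \frac{g(x)}{\sqrt{E(|g|^2|\Sigma_0)(x)}}\ \mathbb{1}_{S_0}(x)\ ;$$

from the properties of the conditional expectation it follows that

$$E(G|\Sigma_0)(x) = \frac{E(g|\Sigma_0)(x)}{\sqrt{E(|g|^2|\Sigma_0)(x)}}\ \mathbb{1}_{S_0}(x) = 0 \tag{$*$}$$

$$E(|G|^2|\Sigma_0)(x) = \frac{E(|g|^2|\Sigma_0)(x)}{E(|g|^2|\Sigma_0)(x)}\ \mathbb{1}_{S_0}(x) = \mathbb{1}_{S_0}(x)\ . \tag{$**$}$$

Let $\{|e^0_n\rangle\}_{n\in\mathbb{N}}$ be an orthonormal basis in the Hilbert space $L^2_\mu(S_0)$ of square
summable functions supported within $S_0$ and set $|f^1_n\rangle := M_G|e^0_n\rangle$ where $M_G$
denotes the multiplication by $G$, namely $f^1_n(x) = G(x)\, e^0_n(x)$. Notice that $|f^1_n\rangle \in \mathbb{H}_1$;
further, by using (2.51) and (2.53), it follows that $|f^1_n\rangle \perp \mathbb{H}_0$, whence $|f^1_n\rangle \in \mathbb{K}_0 = \mathbb{H}_1 \ominus \mathbb{H}_0$
as defined in (2.74). Indeed, let $|h_0\rangle \in \mathbb{H}_0$, then ($*$) yields

$$\begin{aligned}
\langle h_0|f^1_n\rangle &= \int_{S_0} d\mu(x)\ h_0^*(x)\, G(x)\, e^0_n(x) = \int_{S_0} d\mu(x)\ E\big(h_0^*\, G\, e^0_n\big|\Sigma_0\big)(x) \\
&= \int_{S_0} d\mu(x)\ h_0^*(x)\, E(G|\Sigma_0)(x)\, e^0_n(x) = 0\ .
\end{aligned}$$

Also, the set $\{|f^1_n\rangle\}_{n\in\mathbb{N}}$ is orthonormal, for

$$\begin{aligned}
\langle f^1_j|f^1_k\rangle &= \int_{S_0} d\mu(x)\ (e^0_j)^*(x)\, |G(x)|^2\, e^0_k(x) = \int_{S_0} d\mu(x)\ E\big((e^0_j)^*\, |G|^2\, e^0_k\big|\Sigma_0\big)(x) \\
&= \int_{S_0} d\mu(x)\ (e^0_j)^*(x)\, E(|G|^2|\Sigma_0)(x)\, e^0_k(x) = \langle e^0_j|e^0_k\rangle = \delta_{jk}\ ,
\end{aligned}$$

where the last line uses ($**$). Therefore, $\mathbb{K}_0$ must be an infinite dimensional separable Hilbert space.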


2.3.2 Ergodicity and Convexity 


We conclude this section by considering some aspects of ergodicity and mixing in 
relation to continuous dynamics on compact, metric spaces and to the convex space 
M(X,T) of their regular, T-invariant Borel measures. The first result [370] states 
that ergodic measures are extremal in M (X, T), namely they cannot be decomposed 
into convex combinations of other measures in M(X, T). We shall make use of the 
algebraic setting of Definition 2.2.4. 


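Proposition 2.3.8 An algebraic triplet $(C(\mathcal{X}), \Theta_T, \omega_\mu)$ is ergodic if and only
if $M(\mathcal{X}, T) \ni \omega_\mu = \lambda\,\omega_1 + (1-\lambda)\,\omega_2$, $0 < \lambda < 1$, $\omega_{1,2} \in M(\mathcal{X}, T)$ implies
$\omega_\mu = \omega_{1,2}$.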


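Proof Suppose a $T$-invariant Borel measurable subset $E$ exists such that $0 < \mu(E) < 1$
and let $E^c := \mathcal{X} \setminus E$. With $\mathbb{1}_E$ and $\mathbb{1}_{E^c}$ their characteristic functions and
$\omega_\mu(\mathbb{1}_E) = \mu(E)$, the two states

$$C(\mathcal{X}) \ni f \mapsto \omega_1(f) = \frac{\omega_\mu(\mathbb{1}_E\, f)}{\mu(E)}\ , \qquad C(\mathcal{X}) \ni f \mapsto \omega_2(f) = \frac{\omega_\mu(\mathbb{1}_{E^c}\, f)}{1 - \mu(E)}$$

are different and both in $M(\mathcal{X}, T)$; furthermore, they decompose $\omega_\mu$, for
$\omega_\mu = \mu(E)\,\omega_1 + (1 - \mu(E))\,\omega_2$.

Suppose $\omega_\mu$ can be decomposed as stated in the proposition, then the measure
$\mu_1 \in M(\mathcal{X}, T)$ corresponding to $\omega_1$ is absolutely continuous with respect to
$\mu$ (see Remark 2.2.4.2). Let $f_1(x) \ge 0$ be its Radon-Nikodym derivative. Consider
the measurable subset $E = \{x \in \mathcal{X} : f_1(x) < 1\}$. Observe that one can decompose
$E = (E \cap T^{-1}(E)) \cup (E \setminus T^{-1}(E))$ by means of disjoint subsets and, analogously,
$T^{-1}(E) = (T^{-1}(E) \cap E) \cup (T^{-1}(E) \setminus E)$. As $\mu_1 \circ T^{-1} = \mu_1$,

$$\begin{aligned}
\omega_1(\mathbb{1}_E) &= \int_{E\cap T^{-1}(E)} d\mu(x)\, f_1(x) + \int_{E\setminus T^{-1}(E)} d\mu(x)\, f_1(x) = \omega_1(\mathbb{1}_{T^{-1}(E)}) \\
&= \int_{T^{-1}(E)\cap E} d\mu(x)\, f_1(x) + \int_{T^{-1}(E)\setminus E} d\mu(x)\, f_1(x)\ .
\end{aligned}$$

Therefore, as $f_1 < 1$ on $E$ while $f_1 \ge 1$ outside it, it follows that

$$\mu(E \setminus T^{-1}(E)) \ge \int_{E\setminus T^{-1}(E)} d\mu(x)\, f_1(x) = \int_{T^{-1}(E)\setminus E} d\mu(x)\, f_1(x) \ge \mu(T^{-1}(E) \setminus E)\ ,$$

the first inequality being strict unless $\mu(E \setminus T^{-1}(E)) = 0$.
Then, $\mu(E \setminus T^{-1}(E)) = \mu(T^{-1}(E) \setminus E) = 0$ since

$$\begin{aligned}
\mu(E) &= \mu(E \cap T^{-1}(E)) + \mu(E \setminus T^{-1}(E)) \\
&= \mu(T^{-1}(E)) = \mu(T^{-1}(E) \cap E) + \mu(T^{-1}(E) \setminus E)\ .
\end{aligned}$$

Thus, $E$ is $T$-invariant apart from sets of 0 measure $\mu$; if the system is ergodic, this
implies either $\mu(E) = 0$ or $\mu(E) = 1$. The latter equality cannot hold, otherwise
$1 = \omega_1(\mathbb{1}) = \omega_1(\mathbb{1}_E) < \mu(E) = 1$; thus, $\mu(E) = 0$. The same argument applied to
$F := \{x \in \mathcal{X} : f_1(x) > 1\}$ leads to $\mu(F) = 0$, whence to $f_1(x) = 1$ $\mu$-a.e. on $\mathcal{X}$,
which implies $\omega_\mu = \omega_1$ and thus extremality.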


The second result is a refinement [370] of Proposition 2.3.2. 


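Proposition 2.3.9 The triplet $(C(\mathcal{X}), \Theta_T, \omega_\mu)$ is ergodic, respectively mixing, if and
only if, for all $f \in C(\mathcal{X})$ and $g \in L^1_\mu(\mathcal{X})$,

$$\lim_{t\to\infty} \frac{1}{t}\sum_{s=0}^{t-1} \omega_\mu\big(\Theta_T^s(f)\, g\big) = \omega_\mu(f)\,\omega_\mu(g) \tag{2.75}$$

$$\lim_{t\to\infty} \omega_\mu\big(\Theta_T^t(f)\, g\big) = \omega_\mu(f)\,\omega_\mu(g)\ . \tag{2.76}$$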


Proof If (2.75) holds, it implies (2.65) by the fact that any $\psi \in L^2_\mu(\mathcal{X})$ is also
summable and can be approximated in $L^2_\mu(\mathcal{X})$ by continuous functions. Vice versa,
(2.65) implies (2.75) as any $f \in C(\mathcal{X})$ also belongs to $L^2_\mu(\mathcal{X})$ and any $g \in L^1_\mu(\mathcal{X})$
can be approximated in $L^1_\mu(\mathcal{X})$ by square-summable functions. The same considerations
can be used to prove that (2.76) is equivalent to (2.66).


Example 2.3.4 Given the algebraic triplet $(C(\mathcal{X}), \Theta_T, \omega_\mu)$, let $\nu$ be another state
on $C(\mathcal{X})$ absolutely continuous with respect to $\omega_\mu$, but not $\Theta_T$-invariant, that is, for
all $f \in C(\mathcal{X})$,

$$\nu(f) = \int_{\mathcal{X}} d\mu(x)\ g_\nu(x)\, f(x) = \omega_\mu(g_\nu\, f)\ , \qquad \nu(\mathbb{1}) = \int_{\mathcal{X}} d\mu(x)\ g_\nu(x) = \omega_\mu(g_\nu) = 1\ ,$$

with Radon-Nikodym derivative $g_\nu \neq \Theta_T(g_\nu) \in L^1_\mu(\mathcal{X})$. From a physical point
of view, $\nu$ can be considered as a perturbation of the equilibrium state $\omega_\mu$. By
duality (see (2.8)), for all $f \in C(\mathcal{X})$, $\nu_t(f) = \nu(\Theta_T^t(f))$ where $\nu_t := \nu \circ \Theta_T^t$. If
$(C(\mathcal{X}), \Theta_T, \omega_\mu)$ is mixing, then (2.76) implies

$$\lim_{t\to\infty} \nu_t(f) = \omega_\mu(f)\,\omega_\mu(g_\nu) = \omega_\mu(f)\ , \qquad \forall f \in C(\mathcal{X})\ .$$


Physical instances of measures that are absolutely continuous with respect to an 
invariant one are local perturbations of equilibrium states; then, being mixing guar- 
antees that these perturbations fade away in time and provides a mathematical expla- 
nation of relaxation to equilibrium. 


2.4 Information and Entropy 


At its simplest, information theory is concerned with the description of two parties 
transmitting information to each other. Information is physical as it is encoded into 
physical carriers, e.g. electromagnetic waves, that undergo physical processes, e.g. 
interactions with an optical fiber. As long as the laws that describe these processes 
are those of classical physics, one talks of classical information theory. 


2.4.1 Transmission Channels 


In the following, we shall consider two parties A and B exchanging signals according 
to the following typical scheme (see Fig. 2.2): 


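1. At each use, a classical source emits symbols $i$ from an alphabet consisting,
say, of integers $I_A = \{1, 2, \ldots, a\}$: symbols are emitted with probabilities
$\pi_A = \{p_A(i)\}_{i\in I_A}$.

2. After $\ell$ successive uses of the source, the source outputs are strings of length $\ell$,
$\boldsymbol{i}^{(\ell)} := i_1 i_2 \cdots i_\ell \in \Omega_a^{(\ell)}$, emitted with probabilities $p_{A^{(\ell)}}(\boldsymbol{i}^{(\ell)})$. These strings can
be interpreted as outcomes of a random variable $A^{(\ell)} := \bigvee_{i=1}^{\ell} A_i$ which is the join
of $\ell$ successive random variables from the stochastic process $\{A_i\}_{i\in\mathbb{N}}$ ($A_1 := A$)
associated with countably many successive uses of the source. The random variable
$A^{(\ell)}$ is distributed according to the probabilities $\pi_{A^{(\ell)}} = \{p_{A^{(\ell)}}(\boldsymbol{i}^{(\ell)})\}_{\boldsymbol{i}^{(\ell)}\in\Omega_a^{(\ell)}}$ that
$\ell$ subsequent uses have actually emitted a given string of symbols.

3. The sender encodes the emitted strings $\boldsymbol{i}^{(\ell)}$ into strings of fixed length $n$,
$\boldsymbol{x}^{(n)} = x_1 x_2 \cdots x_n \in \Omega_d^{(n)}$, consisting of symbols $x_i$ from another alphabet
$I_X := \{1, 2, \ldots, d\}$. The encoding procedure amounts to a map $\mathcal{E}^{(n)} : \Omega_a^{(\ell)} \to \Omega_d^{(n)}$,

$$\mathcal{E}^{(n)} : \Omega_a^{(\ell)} \ni \boldsymbol{i}^{(\ell)} \mapsto \mathcal{E}^{(n)}(\boldsymbol{i}^{(\ell)}) \in \Omega_d^{(n)}\ . \tag{2.77}$$

4. The code-words $\boldsymbol{x}^{(n)}$ are then sent to a receiver via a transmission channel,
$\mathcal{C}^{(n)} = \mathcal{C} \times \mathcal{C} \times \cdots \mathcal{C}$, which transforms an input string $\boldsymbol{x}^{(n)}$ into an output string
$\boldsymbol{y}^{(n)} = \mathcal{C}^{(n)}(\boldsymbol{x}^{(n)}) = y_1 y_2 \cdots y_n \in \Omega_k^{(n)}$, consisting of symbols $y_i$ from, possibly,
another alphabet $I_Y := \{1, 2, \ldots, k\}$, according to a set of transition probabilities
(compare Example 2.1.5)

$$p(\boldsymbol{y}^{(n)}|\boldsymbol{x}^{(n)}) \ge 0\ , \qquad \sum_{\boldsymbol{y}^{(n)}\in\Omega_k^{(n)}} p(\boldsymbol{y}^{(n)}|\boldsymbol{x}^{(n)}) = 1\ . \tag{2.78}$$

These latter quantities take into account the possibility that the transmission channel
be noisy and thus might randomly associate different outputs $\boldsymbol{y}^{(n)}$ to the same
input $\boldsymbol{x}^{(n)}$.

5. The channel inputs and outputs are thus random variables $X^{(n)}$ and $Y^{(n)}$ with
outcomes $\boldsymbol{x}^{(n)}$ and $\boldsymbol{y}^{(n)}$. If the code-words $\boldsymbol{x}^{(n)}$ occur with input probabilities
$\pi_{X^{(n)}} = \{p_{X^{(n)}}(\boldsymbol{x}^{(n)})\}_{\boldsymbol{x}^{(n)}\in\Omega_d^{(n)}}$, the transition probabilities provide joint probability
distributions for the joint random variables $X^{(n)} \vee Y^{(n)}$ given by

$$\pi_{X^{(n)}\vee Y^{(n)}} = \Big\{p_{X^{(n)}\vee Y^{(n)}}(\boldsymbol{x}^{(n)}, \boldsymbol{y}^{(n)})\ :\ (\boldsymbol{x}^{(n)}, \boldsymbol{y}^{(n)}) \in \Omega_d^{(n)} \times \Omega_k^{(n)}\Big\}$$

$$p_{X^{(n)}\vee Y^{(n)}}(\boldsymbol{x}^{(n)}, \boldsymbol{y}^{(n)}) = p_{X^{(n)}}(\boldsymbol{x}^{(n)})\ p(\boldsymbol{y}^{(n)}|\boldsymbol{x}^{(n)})\ . \tag{2.79}$$

Consequently, the output random variables $Y^{(n)}$ are distributed according to the
marginal probability distributions $\pi_{Y^{(n)}} = \{p_{Y^{(n)}}(\boldsymbol{y}^{(n)})\}_{\boldsymbol{y}^{(n)}\in\Omega_k^{(n)}}$, where

$$p_{Y^{(n)}}(\boldsymbol{y}^{(n)}) = \sum_{\boldsymbol{x}^{(n)}\in\Omega_d^{(n)}} p_{X^{(n)}\vee Y^{(n)}}(\boldsymbol{x}^{(n)}, \boldsymbol{y}^{(n)})\ . \tag{2.80}$$

6. At the receiving end of the transmission channel, the output string $\boldsymbol{y}^{(n)}$ goes
through a decoding procedure whose aim is to retrieve the actual source output
$\boldsymbol{i}^{(\ell)}$ that has been encoded into $\boldsymbol{x}^{(n)} = \mathcal{E}^{(n)}(\boldsymbol{i}^{(\ell)})$ from the received string
$\boldsymbol{y}^{(n)} = \mathcal{C}^{(n)}(\boldsymbol{x}^{(n)})$. Decoding amounts to a map

$$\Omega_k^{(n)} \ni \boldsymbol{y}^{(n)} \mapsto \mathcal{D}^{(\ell)}(\boldsymbol{y}^{(n)}) = \tilde{\boldsymbol{i}}^{(\ell)} \in \Omega_a^{(\ell)}\ . \tag{2.81}$$

Fig. 2.2 Classical transmission channel

The whole procedure comprises the following steps

$$\boldsymbol{i}^{(\ell)} \in \Omega_a^{(\ell)}\ \xrightarrow{\ \mathcal{E}^{(n)}\ }\ \boldsymbol{x}^{(n)} \in \Omega_d^{(n)}\ \xrightarrow{\ \mathcal{C}^{(n)}\ }\ \boldsymbol{y}^{(n)} \in \Omega_k^{(n)}\ \xrightarrow{\ \mathcal{D}^{(\ell)}\ }\ \tilde{\boldsymbol{i}}^{(\ell)} \in \Omega_a^{(\ell)}\ .$$

The efficiency of the transmission is related to how much the decoded word $\tilde{\boldsymbol{i}}^{(\ell)}$
differs from the word $\boldsymbol{i}^{(\ell)}$ that, after being encoded into $\boldsymbol{x}^{(n)}$, has been sent through
the noisy channel and received as $\boldsymbol{y}^{(n)}$. The task is thus to minimize decoding errors
while keeping a non-vanishing number of bits transmitted per use of the channel.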
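
The encode, transmit, decode chain described above can be made concrete with a standard textbook device that is not discussed in the text: a three-fold repetition code with majority-vote decoding. In the sketch below all names (`encode`, `channel`, `decode`) and the explicit flip positions are illustrative assumptions; the channel is made deterministic so that the outcome can be followed by hand.

```python
def encode(bits, r=3):
    """Repetition code: each source bit becomes r identical channel symbols."""
    return [b for b in bits for _ in range(r)]

def channel(word, flips):
    """Binary channel that flips exactly the symbols at the given positions."""
    return [b ^ 1 if k in flips else b for k, b in enumerate(word)]

def decode(word, r=3):
    """Majority vote on each block of r received symbols."""
    blocks = [word[k:k + r] for k in range(0, len(word), r)]
    return [1 if sum(block) > r // 2 else 0 for block in blocks]

source = [1, 0, 1, 1]                        # source string, length 4
sent = encode(source)                        # code-word, length 12
received = channel(sent, flips={0, 4, 11})   # at most one error per block
print(decode(received) == source)            # True: single errors are corrected
print(decode(channel(sent, flips={0, 1})) == source)  # False: two errors in one block
```

The price of this protection is a rate of one source bit per three channel uses, which illustrates the trade-off between decoding errors and transmission rate mentioned above.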

In Sect. 3.2.2, we shall consider the class of memoryless channels without feed- 
back; they act on input symbols in a way which is statistically independent from 
previous inputs and outputs. As such, they are completely specified by factorized 
transition probabilities: 


$$p(\boldsymbol{y}^{(n)}|\boldsymbol{x}^{(n)}) = \prod_{i=1}^{n} p(y_i|x_i)\ . \tag{2.82}$$


Examples 2.4.1 (Channels) 


1. Noiseless Binary Channel: two classical bits (bits) 0, 1 are emitted with probabilities
$p_A(0)$, $p_A(1)$ and sent through a noiseless channel:

$$p(0|0) = p(1|1) = 1\ , \qquad p(0|1) = p(1|0) = 0\ .$$

2. Binary Symmetric Channel: two bits are emitted with probabilities $p_A(0)$, $p_A(1)$
and sent through a channel which flips them according to

$$p(0|0) = p(1|1) = 1 - p > 0\ , \qquad p(0|1) = p(1|0) = p > 0\ .$$

3. Binary Erasure Channel: two bits are emitted with probabilities $p_A(0)$, $p_A(1)$
and sent through a channel which does not flip them, but may erase either of
them with the same probability $0 < \alpha < 1$. This action is described by a map $\mathcal{C}$ from
the two-letter alphabet $\{0, 1\}$ onto the three-symbol alphabet $\{0, 1, 2\}$, where 2
stands for a junk symbol, and by transition probabilities

$$p(0|0) = p(1|1) = 1 - \alpha\ , \qquad p(2|0) = p(2|1) = \alpha\ .$$
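
The three channels above, together with (2.80) and (2.82), can be tabulated directly. The following is a minimal sketch with assumed numerical values (`p = 0.1`, `alpha = 0.25`, input probabilities `[0.7, 0.3]`); the convention `channel[x][y]` for $p(y|x)$ and the function names are choices made here for illustration only.

```python
from itertools import product

def output_distribution(p_in, channel):
    """Marginal p_Y(y) = sum_x p_X(x) p(y|x), with channel[x][y] = p(y|x)."""
    n_out = len(channel[0])
    return [sum(p_in[x] * channel[x][y] for x in range(len(p_in)))
            for y in range(n_out)]

def memoryless(channel, xs, ys):
    """Factorized transition probability p(y^(n)|x^(n)) = prod_i p(y_i|x_i)."""
    prob = 1.0
    for x, y in zip(xs, ys):
        prob *= channel[x][y]
    return prob

p = 0.1        # assumed flip probability of the binary symmetric channel
alpha = 0.25   # assumed erasure probability

noiseless = [[1.0, 0.0], [0.0, 1.0]]
symmetric = [[1 - p, p], [p, 1 - p]]
erasure = [[1 - alpha, 0.0, alpha], [0.0, 1 - alpha, alpha]]  # output 2 = erased

p_in = [0.7, 0.3]
print(output_distribution(p_in, noiseless))   # input distribution unchanged
print(output_distribution(p_in, symmetric))   # approximately [0.66, 0.34]
print(output_distribution(p_in, erasure))     # approximately [0.525, 0.225, 0.25]

# For a fixed input string, the factorized probabilities sum to 1 over all outputs.
total = sum(memoryless(symmetric, (0, 1, 1), ys) for ys in product(range(2), repeat=3))
print(total)
```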


A noiseless channel is characterized by transition probabilities that equal 1 in
correspondence to specific pairs of input and output strings and vanish otherwise.
There are then no distortions in transmitting or storing information by means of
these channels. In such cases, the question is whether the source information can
be compressed and retrieved with negligible probability of error, the possibility of
compression depending upon the presence of redundancies and regularities in the
source.

More precisely, if the source emits binary strings of length $n$, one asks (1) whether
for each bit from the source one can store $h < 1$ bits, still being able to reliably
reconstruct the information emitted by the source from the $2^{hn}$ bit strings effectively
retained and (2) which is the optimal compression rate $h$ achievable. This problem
is addressed by Shannon's first theorem which asserts that, for stationary sources,
the optimal rate is their entropy rate.

In the presence of noise in the transmission channel, the strategy is somehow the
reverse with respect to noiseless transmission; in order to reduce the possibility of
noise-induced errors, one introduces redundancies by multiple uses of the channel.
The aim is to optimize the number of signals that can faithfully be transmitted by $n$
uses of the channel. Shannon's second theorem proves that this number can be made
to increase exponentially with $n$ at an optimal rate $R$, the channel capacity.


2.4.2 Stationary Information Sources


In most informational contexts, stationary sources are a reasonable description of
the actual physical processes taking place. Stationarity means that the probability
of a string $\boldsymbol{i}^{(\ell)}$ does not depend on when the source has emitted it, but only on the
letters emitted. This is equivalent to (compare (2.30))

$$\sum_{i_1} p_{A^{(\ell)}}(i_1 i_2 \cdots i_\ell) = p_{A^{(\ell-1)}}(i_2 i_3 \cdots i_\ell)\ .$$

This condition goes together with the fact that the probability of a string of length
$\ell - 1$ must be the sum of the probabilities of all words of length $\ell$ with the same first
$\ell - 1$ symbols (compare (2.29)),

$$\sum_{i_\ell} p_{A^{(\ell)}}(i_1 i_2 \cdots i_\ell) = p_{A^{(\ell-1)}}(i_1 i_2 \cdots i_{\ell-1})\ .$$

The similarities with shift dynamical systems are now apparent.


Lemma 2.4.1 A classical, stationary information source corresponds to a stationary
stochastic process $\{A_i\}_{i\in\mathbb{N}}$, where the random variables $A_i$ take on values in an
alphabet $I_A = \{1, 2, \ldots, a\}$ and the joint random variables $A^{(n)} := \bigvee_{i=1}^{n} A_i$ are
distributed with probability distributions $\pi_{A^{(n)}}$ satisfying appropriate compatibility
and stationarity conditions.

Equivalently, a stationary classical source can be described by the measure-theoretic
triplet $(\Omega_a, T_\sigma, \mu_A)$, where $\mu_A = \mu_A \circ T_\sigma^{-1}$ is a state on the set $\Omega_a$ of
semi-infinite strings $\boldsymbol{i} = \{i_j\}_{j\in\mathbb{N}}$, $i_j \in I_A$, equipped with the left shift $T_\sigma$. The restrictions
$\mu_A^{(n)}$ of $\mu_A$ to the sets of finite strings $\Omega_a^{(n)}$ are given by the probability distributions
$\pi_{A^{(n)}}$.
Finally, a stationary source can be described as a $C^*$ triplet $(D_A, \Theta_\sigma, \omega_A)$ as
in Definition 2.2.5, namely by a semi-infinite classical spin-chain consisting of a
lattice of $a$-valued spins described, locally, by tensor products $D_a^{(n)} = \bigotimes_{j=0}^{n-1} D_a(\mathbb{C})$ of
diagonal $a \times a$ matrix algebras $D_a(\mathbb{C})$, equipped with an automorphism $\Theta_\sigma$ which
amounts to the left shift along the chain and with a $\Theta_\sigma$-invariant state $\omega_A$ such that
(compare (2.56))

$$\omega_A \restriction D_a^{(n)} = \sum_{\boldsymbol{i}^{(n)}\in\Omega_a^{(n)}} p_{A^{(n)}}(\boldsymbol{i}^{(n)})\ P_{\boldsymbol{i}^{(n)}}\ .$$


Examples 2.4.2


1. Bernoulli Sources (see Example 2.1.4): the probabilities of strings are products of
the probabilities of their symbols, $p_{A^{(n)}}(\boldsymbol{i}^{(n)}) = \prod_{j=1}^{n} p_A(i_j)$, that are statistically
independent from each other.

2. Markov Sources (see Example 2.1.5): the probability of emission of the $n$-th
symbol depends only on the $(n-1)$-th one, namely

$$\begin{aligned}
p_{A^{(n)}}(\boldsymbol{i}^{(n)}) &= p(i_n|i_1, i_2, \ldots, i_{n-1})\ p_{A^{(n-1)}}(i_1 i_2 \cdots i_{n-1}) \\
&= p(i_n|i_{n-1})\ p_{A^{(n-1)}}(i_1 i_2 \cdots i_{n-1}) \\
&= p(i_n|i_{n-1})\ p(i_{n-1}|i_{n-2}) \cdots p(i_2|i_1)\ p_A(i_1)\ ,
\end{aligned}$$

where $p(i_n|i_1, i_2, \ldots, i_{n-1})$ are the conditional probabilities for the occurrence
of the symbol $i_n$ if the symbols $i_1, i_2, \ldots, i_{n-1}$ have already occurred. Using
(2.30), it follows that stationarity is equivalent to the probability vector $|\pi_A\rangle = \{p_A(i)\}_{i\in I_A}$
being an eigenvector, relative to the eigenvalue 1, of the matrix of transition
probabilities.
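
The link between stationarity and the eigenvalue-1 eigenvector can be verified on a two-symbol Markov source. The sketch below assumes a toy transition matrix `trans` with entries `trans[i][j]` $= p(i|j)$ (symbols are labelled 0 and 1 rather than 1 and 2, purely for indexing convenience); it compares the stationarity condition $\sum_{i_1} p(i_1 i_2 \cdots i_\ell) = p(i_2 \cdots i_\ell)$ for the invariant initial vector and for a non-invariant one.

```python
def string_probability(word, p0, trans):
    """p(i_1...i_n) = p(i_n|i_{n-1}) ... p(i_2|i_1) p(i_1), with trans[i][j] = p(i|j)."""
    prob = p0[word[0]]
    for prev, cur in zip(word, word[1:]):
        prob *= trans[cur][prev]
    return prob

def left_marginal(word, p0, trans):
    """Sum over a prepended first symbol: sum_{i_1} p(i_1 word)."""
    return sum(string_probability((i,) + tuple(word), p0, trans)
               for i in range(len(p0)))

# Assumed toy Markov source on two symbols; columns of trans sum to 1.
trans = [[0.9, 0.2],
         [0.1, 0.8]]
stationary = [2 / 3, 1 / 3]   # eigenvector of trans for the eigenvalue 1
other = [0.5, 0.5]            # not an eigenvector

word = (1, 0)
print(string_probability(word, stationary, trans), left_marginal(word, stationary, trans))
print(string_probability(word, other, trans), left_marginal(word, other, trans))
```

With the invariant vector the two printed numbers agree; with the other initial vector they differ (0.1 against 0.09, up to rounding), so that source is not stationary.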

2.4.3 Shannon Entropy


Like an information source $A$ that emits symbols $j \in \{1, 2, \ldots, a\}$ with probabilities
$p_A(j)$, also a partition $\mathcal{P} = \{P_i\}_{i\in I_\mathcal{P}}$ of the phase-space of a dynamical system
$(\mathcal{X}, T, \mu)$ into atoms with volumes $\mu(P_i)$ can be interpreted as a classical random
variable. In the latter case, randomness is related to the fact that the phase-point or
state of the system is localized within the atom $P_i$ with probability $\mu(P_i)$.

The notion of entropy measures the amount of uncertainty about the outcomes of
a random variable like $\mathcal{P}$ before the phase-point has been localized within a definite
atom, for instance as a consequence of an observation or a measurement process
of some sort. Equivalently, entropy measures the amount of information, relative to the
partition $\mathcal{P}$, that has been gained after the phase-point of the system has indeed been
localized in one of its atoms.


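Definition 2.4.1 (Shannon Entropy) The Shannon entropy of a discrete random
variable $A$ with probability distribution $\pi_A = \{p_A(j)\}_{j\in I_A}$ is given by

$$H(A) := -\sum_{j=1}^{a} p_A(j) \log p_A(j) = \sum_{j=1}^{a} \eta(p_A(j))\ , \tag{2.83}$$

where

$$\eta(x) = \begin{cases} 0 & x = 0 \\ -x\log x & 0 < x \le 1 \end{cases}\ . \tag{2.84}$$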


Remarks 2.4.1 The Shannon entropy plays for discrete dynamical systems the role
played by the Gibbs entropy for continuous systems, which is defined as [200,353]

$$H_G(\rho) := -\int_{\mathcal{X}} dx\ \rho(x) \log \rho(x)\ ,$$

for a state on the phase-space $\mathcal{X}$ with probability density $\rho(x)$.


The Shannon entropy is such that $H(A) = 0$ if and only if one outcome, say
$j^*$, occurs with probability $p_A(j^*) = 1$, while $p_A(j) = 0$ for $j \neq j^*$; it reaches
its maximum $H(A) = \log a$ when all outcomes are equiprobable, $p_A(j) = 1/a$.
Indeed, the function $\eta(x)$ is concave, whence

$$x\,(\log x - \log y) \ge x - y\ , \qquad \forall x, y \in [0, 1]\ , \tag{2.85}$$

with equality holding if and only if $x = y$.
Let then $E$ be a random variable with $\pi_E = \{p_E(j) = 1/a\}_{j=1}^{a}$; then $H(E) = \log a$ and

$$H(A) - H(E) = -\sum_{j=1}^{a} p_A(j)\big(\log p_A(j) + \log a\big) \le -\sum_{j=1}^{a} \big(p_A(j) - 1/a\big) = 0\ .$$

Given two random variables $A$ and $B$, we shall keep the notation used for the join
of two partitions and denote by $A \vee B$ the random variable with joint probability
distribution

$$\pi_{A\vee B} := \{p_{A\vee B}(i, j)\}_{i\in I_A,\, j\in I_B}\ , \qquad I_A = \{1, 2, \ldots, a\}\ , \quad I_B = \{1, 2, \ldots, b\}\ . \tag{2.86}$$

By summing over the outcomes of $B$, respectively $A$, one obtains the marginal
probability distributions $\pi_A := \{p_A(i)\}_{i\in I_A}$ and $\pi_B := \{p_B(j)\}_{j\in I_B}$, where

$$p_A(i) = \sum_{j=1}^{b} p_{A\vee B}(i, j)\ , \qquad p_B(j) = \sum_{i=1}^{a} p_{A\vee B}(i, j)\ . \tag{2.87}$$


Lemma 2.4.2 (Subadditivity) Given two random variables $A$ and $B$,

$$H(A \vee B) \le H(A) + H(B)\ . \tag{2.88}$$


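Proof Given $\pi_{A\vee B}$ as in (2.86) and $\pi_A$ and $\pi_B$ as in (2.87), use (2.85) with
$x = p_{A\vee B}(i, j)$ and $y = p_A(i)\, p_B(j)$,

$$\begin{aligned}
H(A) + H(B) - H(A \vee B) &= \sum_{i\in I_A,\, j\in I_B} p_{A\vee B}(i, j) \log \frac{p_{A\vee B}(i, j)}{p_A(i)\, p_B(j)} \\
&\ge \sum_{i\in I_A,\, j\in I_B} \big(p_{A\vee B}(i, j) - p_A(i)\, p_B(j)\big) = 0\ .
\end{aligned}$$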

Remarks 2.4.2 


1. As already observed, any finite (measurable) partition $\mathcal{P} = \{P_i\}_{i\in I_\mathcal{P}}$ of $(\mathcal{X}, T, \mu)$
is a random variable $\mathcal{P}$ whose outcomes correspond to the labels of the atoms
to which the system phase-point happens to belong. The volumes of the atoms
give the probabilities of such occurrences, so that a finite partition $\mathcal{P}$ also
attributes to the random variable $\mathcal{P}$ the natural probability distribution
$\pi_\mathcal{P} = \mu_\mathcal{P} = \{\mu(P_i)\}_{i\in I_\mathcal{P}}$.

2. In analogy with Definition 2.2.1.2, a random variable $A$ is finer than a random
variable $B$ ($B \preceq A$) if each outcome $j \in I_B$ of $B$ is determined by a subset
$I_A^j \subseteq I_A$ of outcomes of $A$. It follows that, if $A$ has probability distribution
$\pi_A = \{p_A(i)\}_{i\in I_A}$ and $B$ probability distribution $\pi_B = \{p_B(j)\}_{j\in I_B}$, $B \preceq A$ implies
$p_B(j) = \sum_{i\in I_A^j} p_A(i)$.

3. According to Definition 2.2.1.3, the refinement $\mathcal{P} \vee \mathcal{Q}$ of two partitions
$\mathcal{P} = \{P_i\}_{i\in I_\mathcal{P}}$ and $\mathcal{Q} = \{Q_j\}_{j\in I_\mathcal{Q}}$ is a random variable $\mathcal{P} \vee \mathcal{Q}$ with joint probability
distribution $\mu_{\mathcal{P}\vee\mathcal{Q}} = \{\mu(P_i \cap Q_j)\}_{i\in I_\mathcal{P},\, j\in I_\mathcal{Q}}$. $\mathcal{P} \vee \mathcal{Q}$ is finer than both random
variables $\mathcal{P}$ and $\mathcal{Q}$; also, $\mathcal{Q} \preceq \mathcal{P} \implies \mathcal{P} \vee \mathcal{Q} = \mathcal{P}$.


2.4.4 Conditional Entropy 


Because of possible statistical correlations, the knowledge of a random variable A 
may decrease the uncertainty about another random variable B; the less so, the more 
A and B are statistically independent. These intuitive arguments are formalized by 
introducing the notions of conditional probability, conditional entropy and mutual 
information. 

Consider two random variables $A$ and $B$ with probability distributions
$\pi_A = \{p_A(i)\}_{i\in I_A}$, respectively $\pi_B = \{p_B(j)\}_{j\in I_B}$, and joint probability
$\pi_{A\vee B} = \{p_{A\vee B}(i, j)\}_{i\in I_A,\, j\in I_B}$. The quantity

$$p_{A|B=j}(i|j) := \frac{p_{A\vee B}(i, j)}{p_B(j)} \tag{2.89}$$

represents the probability of the outcome $A = i$ conditioned upon the outcome $B = j$.
Altogether, $\pi_{A|B=j} = \{p_{A|B=j}(i|j)\}_{i=1}^{a}$ is the conditional probability distribution of
$A$ conditioned upon the outcome $B = j$. The conditional probabilities are such that

$$p_{A|B=j}(i|j) \ge 0\ , \qquad \sum_{i=1}^{a} p_{A|B=j}(i|j) = 1 \qquad \forall j = 1, 2, \ldots, b\ .$$

The notion of conditional probability is naturally associated to that of conditional 
entropy which measures the amount of uncertainty about a random variable A which 
is left once that relative to another one, B, has been removed. 


Definition 2.4.2 (Conditional Entropy) 

Given two random variables $A$ and $B$ with probabilities $\pi_A$, $\pi_B$ as in (2.87) and
joint probability $\pi_{A\vee B}$ as in (2.86), the conditional entropy of $A$ with respect to $B$
is

\begin{align}
H(A|B) &:= \sum_{j=1}^{b} p_B(j)\, H(A|B=j) \tag{2.90} \\
&= -\sum_{j=1}^{b} p_B(j) \sum_{i=1}^{a} \frac{p_{A\vee B}(i,j)}{p_B(j)} \log \frac{p_{A\vee B}(i,j)}{p_B(j)} \notag \\
&= H(A \vee B) - H(B), \tag{2.91}
\end{align}

where $H(A|B=j)$ is the Shannon entropy corresponding to the conditional probability $\pi_{A|B=j}$.
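A small numerical sketch may help fix Definition 2.4.2. Assuming a hypothetical 3 x 2 joint distribution (the numbers and the helper names `shannon` and `conditional_entropy` are illustrative and not taken from the text), the fragment below evaluates H(A|B) as the average in (2.90) and compares it with the difference H(A v B) - H(B) in (2.91). Natural logarithms are used.

```python
import math

def shannon(probs):
    """Shannon entropy (natural logarithm) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(A|B) as in (2.90); joint[i][j] plays the role of p_{AvB}(i, j)."""
    n_a, n_b = len(joint), len(joint[0])
    p_b = [sum(joint[i][j] for i in range(n_a)) for j in range(n_b)]
    total = 0.0
    for j in range(n_b):
        if p_b[j] > 0:
            # conditional distribution of A given the outcome B = j, see (2.89)
            cond = [joint[i][j] / p_b[j] for i in range(n_a)]
            total += p_b[j] * shannon(cond)
    return total

# Hypothetical joint distribution: rows are outcomes of A, columns outcomes of B.
joint = [[0.30, 0.10],
         [0.05, 0.25],
         [0.15, 0.15]]

p_a = [sum(row) for row in joint]
p_b = [sum(row[j] for row in joint) for j in range(2)]

h_cond = conditional_entropy(joint)                       # right-hand side of (2.90)
chain_rule = shannon(p for row in joint for p in row) - shannon(p_b)  # (2.91)
print(h_cond, chain_rule, shannon(p_a))
```

The two printed values of H(A|B) should agree up to rounding, and both should lie between 0 and H(A), which is the first statement of the lemma that follows.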


Lemma 2.4.3 The conditional entropy fulfils

\begin{align*}
& 0 \le H(A|B) \le H(A), \\
& H(A \vee B|C) = H(A|C) + H(B|A \vee C) \le H(A|C) + H(B|C).
\end{align*}

Proof The lower bound follows since the right-hand side of (2.90) is non-negative, while
the first upper bound is a consequence of (2.88) applied to (2.91). Further, using the
latter relation one gets

\begin{align*}
H(A \vee B|C) &= H(A \vee B \vee C) - H(C) \\
&= H(A \vee C) - H(C) + H(A \vee B \vee C) - H(A \vee C) \\
&= H(A|C) + H(B|A \vee C),
\end{align*}

while (2.88) applied to $H(A \vee B|C = k)$ gives the second upper bound.


Corollary 2.4.1 $B \preceq A \Longrightarrow H(B) \le H(A)$.
Proof From Remark 2.4.2.3 it follows that $B \preceq A \Longrightarrow A \vee B = A$; thus,

\[ H(A) = H(A \vee B) = H(A|B) + H(B) \ge H(B). \]


Example 2.4.3 If N denotes the (trivial) random variable with only one certain 
outcome, then H(A|N) = H(A) for any other random variable A. 
By the definition of conditional entropy, $H(A|B) = 0$ implies that in (2.90)

\[ \sum_{i=1}^{a} \frac{p_{A\vee B}(i,j)}{p_B(j)} \log \frac{p_{A\vee B}(i,j)}{p_B(j)} = 0 \quad \forall\, j. \]

Therefore, for fixed $j$, $p_{A\vee B}(i,j) = p_B(j)$ for only one $i \in I_A$; thus, for each fixed
$i \in I_A$, the index set $I_B$ can be subdivided into disjoint subsets $I_B^i$ such that

\[ p_A(i) = \sum_{j\in I_B^i} p_{A\vee B}(i,j) = \sum_{j\in I_B^i} p_B(j). \]

That is (see Remark 2.4.2.2) $A \preceq B$; indeed, the outcomes of $A$ are determined
by those of $B$. In other words, when knowing $B$ means knowing $A$, then $A \preceq B$.
Vice versa, if $B$ is finer than $A$, then $H(A|B) = 0$; in fact, $A \preceq B \Longrightarrow A \vee B = B$
(see Remark 2.4.2.3), whence (2.91) gives $H(A|B) = H(A \vee B) - H(B) = 0$.

Remarks 2.4.3 
1. Conditioning can be extended to random variables $A_i$, $i = 1, 2, \ldots, n$. The probability
of the events $A_i = a_i$, $i = p+1, \ldots, n$, conditioned on the events $A_j = a_j$,
$j = 1, \ldots, p$, is given by

\[ p(a_{p+1} \cdots a_n | a_1 \cdots a_p) := \frac{p(a_1 \cdots a_p\, a_{p+1} \cdots a_n)}{p(a_1 \cdots a_p)}, \]

where explicit reference to the random variables in $p(\cdots)$ has been omitted, for the
sake of simplicity. It follows that also the notion of conditional entropy can be
extended to

\begin{align*}
H(A_{p+1} \vee A_{p+2} \vee \cdots \vee A_n | A_1 \vee A_2 \vee \cdots \vee A_p) = {}& -\sum_{a_1, \ldots, a_p} p(a_1, \ldots, a_p) \\
& \times \sum_{a_{p+1}, \ldots, a_n} p(a_{p+1} \cdots a_n | a_1 \cdots a_p) \log p(a_{p+1} \cdots a_n | a_1 \cdots a_p).
\end{align*}

2. A sequence of random variables $\{A_j\}_{j\in\mathbb{N}}$ forms a Markov process as in Example 2.4.2.2
if $p(a_n | a_1 \cdots a_{n-1}) = p(a_n | a_{n-1})$ for all $n \in \mathbb{N}$. In such a case
$H(A_n | A_1 \vee \cdots \vee A_{n-1}) = H(A_n | A_{n-1})$.

3. Since the conditional entropy is positive, it follows that

\[ H(A \vee B) \ge \max\{H(A), H(B)\}. \]

Both this observation and subadditivity (2.88) agree with the interpretation of the
entropy as a measure of uncertainty. The latter is in fact greater about $A \vee B$ than
about either $A$ or $B$, while, due to possible statistical correlations between $A$ and
$B$, the uncertainty of $A \vee B$ is smaller than the sum of the uncertainties of $A$ and
$B$ independently. Further, due to (2.85),

\[ H(A \vee B) = H(A) + H(B) \]

if and only if $\pi_{A\vee B}$ factorizes into the product of the probabilities, namely if and
only if $A$ and $B$ are statistically independent.

4. The second upper bound in Lemma 2.4.3 yields $H(B|A \vee C) \le H(B|C)$; as $C \preceq A \vee C$,
this inequality is a particular instance of the more general monotonicity
property of the conditional entropy established in Corollary 2.4.2.


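Example 2.4.4 Suppose a random variable is given by a finite partition $\mathcal{P} = \{P_i\}_{i=1}^{d}$
with atoms that are measurable with respect to a σ-algebra $\Sigma$ generated by a
measure-algebra $\Sigma_0$ as in Example 2.2.1. Then, for any $\varepsilon > 0$, there exists a partition
$\mathcal{Q} = \{Q_i\}_{i=1}^{d}$ with atoms $Q_i \in \Sigma_0$ such that $H(\mathcal{P}|\mathcal{Q}) \le \varepsilon$. In fact, as shown in the
example, one can always construct $\mathcal{Q}$ such that, for all $i = 1, 2, \ldots, d$, one has

\[ \mu(P_i \,\Delta\, Q_i) \le \delta\, \frac{\mu(P_i)}{2}, \qquad 0 < \delta < 1. \]

Now, $P_i \subseteq Q_i \cup (P_i \,\Delta\, Q_i)$ and $P_i \,\Delta\, Q_i = (P_i \cup Q_i) \setminus (P_i \cap Q_i)$ yield

\[ \mu(P_i) \le \mu(Q_i) + \delta\, \frac{\mu(P_i)}{2} \quad \text{and} \quad \mu(P_i \,\Delta\, Q_i) \ge \mu(Q_i) - \mu(P_i \cap Q_i). \]

Thus, $\mu(Q_i) \ge \mu(P_i)/2$ and $\delta\, \mu(Q_i) \ge \mu(Q_i) - \mu(P_i \cap Q_i)$, whence

\[ p_{\mathcal{P}|\mathcal{Q}=i}(i|i) = \frac{\mu(P_i \cap Q_i)}{\mu(Q_i)} \ge 1 - \delta. \]

Since $\pi_{\mathcal{P}|\mathcal{Q}=i}$ is a conditional probability, it follows that $p_{\mathcal{P}|\mathcal{Q}=i}(j|i) \le \delta$ for $j \ne i$.
Finally, choosing $\delta$ so that the continuous function $\eta(x)$ in (2.84) be such that
$\eta(x) \le \varepsilon/d$ when $0 \le x \le \delta$ and $1 - \delta \le x \le 1$, (2.91) yields

\[ H(\mathcal{P}|\mathcal{Q}) = \sum_{i=1}^{d} \mu(Q_i) \sum_{j=1}^{d} \eta\big(p_{\mathcal{P}|\mathcal{Q}=i}(j|i)\big) \le \varepsilon. \]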


Proposition 2.4.1 (Strong Subadditivity) Given three discrete random variables $A$,
$B$ and $C$, the following inequality holds,

\[ H(A \vee B \vee C) + H(B) \le H(A \vee B) + H(B \vee C), \tag{2.92} \]

together with those obtained by cyclic permutations of $A$, $B$ and $C$.

Proof Similarly to the proof of subadditivity (2.88), the result follows by applying (2.85)
as follows:

\begin{align*}
H(A \vee B \vee C) + H(B) &- H(A \vee B) - H(B \vee C) \\
&= -\sum_{i\in I_A,\, j\in I_B,\, k\in I_C} p_{A\vee B\vee C}(i,j,k) \log \frac{p_{A\vee B\vee C}(i,j,k)\, p_B(j)}{p_{A\vee B}(i,j)\, p_{B\vee C}(j,k)} \\
&\le -\sum_{i\in I_A,\, j\in I_B,\, k\in I_C} \left( p_{A\vee B\vee C}(i,j,k) - \frac{p_{A\vee B}(i,j)\, p_{B\vee C}(j,k)}{p_B(j)} \right) = 0.
\end{align*}


As a consequence of strong subadditivity, the conditional entropy monotonically 
decreases upon refinement of its second argument. 


Corollary 2.4.2 $B \preceq C \Longrightarrow H(A|C) \le H(A|B)$.
Proof From (2.92) and (2.91),

\[ H(A \vee B \vee C) - H(B \vee C) = H(A|B \vee C) \le H(A \vee B) - H(B) = H(A|B). \]

The result follows since $B \preceq C \Longrightarrow B \vee C = C$.
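Strong subadditivity can also be probed numerically. The sketch below, written under the assumption of a randomly generated joint distribution of three variables with 2, 3 and 2 outcomes (the seed, the shape and the helper names `entropy` and `marginal` are arbitrary illustrative choices), evaluates both sides of (2.92). Their difference equals H(A|B) - H(A|B v C), which is the quantity appearing in the proof of Corollary 2.4.2.

```python
import itertools
import math
import random

def entropy(probs):
    """Shannon entropy (natural logarithm) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def marginal(joint, keep):
    """Marginalise a dict {(i, j, k): probability} onto the index positions in keep."""
    out = {}
    for idx, prob in joint.items():
        key = tuple(idx[m] for m in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

random.seed(0)                      # arbitrary seed, only for reproducibility
shape = (2, 3, 2)                   # numbers of outcomes of A, B, C (hypothetical)
weights = {idx: random.random()
           for idx in itertools.product(*(range(n) for n in shape))}
norm = sum(weights.values())
p_abc = {idx: w / norm for idx, w in weights.items()}

def H(keep):
    """Entropy of the marginal on the chosen variables (0 = A, 1 = B, 2 = C)."""
    return entropy(marginal(p_abc, keep).values())

lhs = H((0, 1, 2)) + H((1,))        # H(A v B v C) + H(B)
rhs = H((0, 1)) + H((1, 2))         # H(A v B) + H(B v C)
ssa_gap = rhs - lhs                 # equals H(A|B) - H(A|B v C)
print(lhs, rhs, ssa_gap)
```

A non-negative `ssa_gap` is what (2.92) predicts; a single random instance is of course only a sanity check, not a proof.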


2.4.5 Mutual Information 


A notion related to the conditional entropy is that of mutual information: it measures 
the amount of information about a random observable A that can be achieved by 
knowing another random variable B. 


Definition 2.4.3 (Mutual Information) Given two random variables $A$ and $B$, their
mutual information is given by

\begin{align}
I(A; B) &:= H(A) + H(B) - H(A \vee B) \notag \\
&= H(A) - H(A|B) = H(B) - H(B|A). \tag{2.93}
\end{align}


The mutual information amounts to the relative entropy (also known as Kullback-Leibler
distance or information divergence) of the joint probability distribution $\pi_{A\vee B}$
with respect to the product probability distribution $\pi_A \times \pi_B = \{p_A(i)\, p_B(j)\}_{i\in I_A,\, j\in I_B}$
obtained from the marginal ones (see (2.87)):

\[ I(A; B) = \sum_{i\in I_A,\, j\in I_B} p_{A\vee B}(i,j) \log \frac{p_{A\vee B}(i,j)}{p_A(i)\, p_B(j)}. \tag{2.94} \]

Since $H(A)$ measures the unconditioned uncertainty about $A$ and $H(A|B)$ the uncertainty
about $A$ if one knows $B$, their difference amounts to the knowledge of $A$ given
by $B$. If $A$ and $B$ are statistically independent, knowing $B$ does not give any information
about $A$, whence $H(A|B) = H(A)$, and $I(A; B) = 0$. On the other hand, if
$A$ is finer than $B$, then knowing $A$ means knowing $B$, thus $B \preceq A \Longrightarrow H(B|A) = 0$
and $I(A; B) = H(B)$. Vice versa, $I(A; B) < H(B)$ means that $H(B|A) > 0$ or,
in other words, that the knowledge of $A$ is unable to remove all the uncertainty
about $B$.
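The equivalence between the entropic expression (2.93) and the relative-entropy expression (2.94) can be checked on a toy case. The joint distribution below is a hypothetical example and the variable names are illustrative; natural logarithms are used throughout.

```python
import math

def entropy(probs):
    """Shannon entropy (natural logarithm) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical joint distribution p_{AvB}(i, j): rows index A, columns index B.
joint = [[0.30, 0.10],
         [0.05, 0.25],
         [0.15, 0.15]]

p_a = [sum(row) for row in joint]
p_b = [sum(row[j] for row in joint) for j in range(len(joint[0]))]

# (2.93): I(A;B) = H(A) + H(B) - H(A v B)
mi_entropic = entropy(p_a) + entropy(p_b) - entropy(p for row in joint for p in row)

# (2.94): relative entropy of the joint with respect to the product of the marginals
mi_relative = sum(p * math.log(p / (p_a[i] * p_b[j]))
                  for i, row in enumerate(joint)
                  for j, p in enumerate(row) if p > 0)

# Independent case: the joint is the product of its marginals, so I(A;B) vanishes.
product = [[pa * pb for pb in p_b] for pa in p_a]
mi_independent = sum(p * math.log(p / (p_a[i] * p_b[j]))
                     for i, row in enumerate(product)
                     for j, p in enumerate(row) if p > 0)
print(mi_entropic, mi_relative, mi_independent)
```

The first two numbers should coincide up to rounding and be bounded by min{H(A), H(B)}, while the third should vanish, in agreement with the discussion above.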

An interesting inequality involves the mutual information in connection with three
random variables $A$, $B$ and $C$ that form a so-called Markov chain $A \to B \to C$ [113];
namely (see Remark 2.4.3.2)

\[ p_{C|A\vee B=(i,j)}(k|i,j) := \frac{p_{A\vee B\vee C}(i,j,k)}{p_{A\vee B}(i,j)} = p_{C|B=j}(k|j) = \frac{p_{B\vee C}(j,k)}{p_B(j)}. \]

Notice that $C$, $B$ and $A$ form a Markov chain $C \to B \to A$, too; indeed, as
$p_{C\vee B}(k,j) = p_{B\vee C}(j,k)$ it turns out that

\begin{align*}
p_{A|C\vee B=(k,j)}(i|k,j) &:= \frac{p_{A\vee B\vee C}(i,j,k)}{p_{C\vee B}(k,j)} = \frac{p_{B\vee C}(j,k)}{p_B(j)}\, \frac{p_{A\vee B}(i,j)}{p_{C\vee B}(k,j)} \\
&= \frac{p_{A\vee B}(i,j)}{p_B(j)} = p_{A|B=j}(i|j).
\end{align*}


Using the latter property one can prove the so-called data processing inequality [113].
Proposition 2.4.2 $A \to B \to C \Longrightarrow I(A; C) \le I(A; B)$.

Proof From Definition 2.4.3, $I(A; B) - I(A; C) = H(A|C) - H(A|B)$ while the
Markovianity assumption yields $H(A|B) = H(A|B \vee C)$ (see Remark 2.4.3.2),
whence, from Proposition 2.4.1,

\begin{align*}
I(A; B) - I(A; C) &= H(A|C) - H(A|B \vee C) \\
&= H(A \vee C) + H(B \vee C) - H(A \vee B \vee C) - H(C) \ge 0.
\end{align*}


The meaning of the data processing inequality is that the mutual information of
two random variables $A$ and $B$ cannot be increased by any further processing of $B$
by a function $C = g(B)$, for this yields a Markov chain $A \to B \to C$.
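As an illustration of Proposition 2.4.2, the following sketch builds a Markov chain A -> B -> C from a source distribution and two stochastic matrices and compares the mutual informations. All numerical values and names (`p_a`, `b_given_a`, `c_given_b`, `mutual_information`) are hypothetical choices made for this example only.

```python
import math

def mutual_information(joint):
    """I(X;Y) as the relative entropy (2.94); joint[x][y] is the joint probability."""
    p_row = [sum(row) for row in joint]
    p_col = [sum(row[y] for row in joint) for y in range(len(joint[0]))]
    return sum(p * math.log(p / (p_row[x] * p_col[y]))
               for x, row in enumerate(joint)
               for y, p in enumerate(row) if p > 0)

# Hypothetical source and transition probabilities.
p_a = [0.5, 0.3, 0.2]                 # distribution of A
b_given_a = [[0.8, 0.2],              # b_given_a[a][b] = p(b|a)
             [0.3, 0.7],
             [0.5, 0.5]]
c_given_b = [[0.9, 0.1],              # c_given_b[b][c] = p(c|b): C depends on A only via B
             [0.4, 0.6]]

n_a, n_b, n_c = 3, 2, 2
joint_ab = [[p_a[a] * b_given_a[a][b] for b in range(n_b)] for a in range(n_a)]
joint_ac = [[sum(p_a[a] * b_given_a[a][b] * c_given_b[b][c] for b in range(n_b))
             for c in range(n_c)] for a in range(n_a)]
p_b = [sum(joint_ab[a][b] for a in range(n_a)) for b in range(n_b)]
joint_bc = [[p_b[b] * c_given_b[b][c] for c in range(n_c)] for b in range(n_b)]

i_ab = mutual_information(joint_ab)
i_ac = mutual_information(joint_ac)
i_bc = mutual_information(joint_bc)
print(i_ac, i_ab, i_bc)
```

One expects `i_ac` not to exceed `i_ab`; since the reversed chain C -> B -> A is Markovian as well, `i_ac` should not exceed `i_bc` either.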


Example 2.4.5 When dealing with noisy transmission channels, signals $a$ from a
source described by a random variable $A$ are encoded into code-words $b(a)$ that give
rise to another random variable $B = B(A)$. Then, the code-words are sent through the
channel which outputs signals $c = c(b)$, providing a third random variable $C(B)$.
Altogether, $A$, $B(A)$ and $C(B)$ form a Markov chain $A \to B(A) \to C(B)$, as
well as $C(B) \to B(A) \to A$; thus, we get the data-processing inequalities

\[ I(A; C(B)) \le I(A; B(A)), \qquad I(A; C(B)) \le I(B(A); C(B)). \tag{2.95} \]


2.4.5.1 Bibliographical Notes 

For a mathematical approach to classical dynamical systems and ergodic theory see
[18,77,111,239,370]. For a more physical point of view on ergodic theory and related
questions, one may consult [8,128,200,353]. References [17,352] have been used
for Hamiltonian mechanics and integrable systems, as well as [128,157,
271,321] for classical chaos. The review [78] discusses in detail the signatures of
chaos in discrete classical systems.

For a modern overview of probability theory see [210]. 

For the notions of entropy and conditional entropy in a dynamical system con- 
text see [18,77,111,197]; consult [113] for the same notions and that of mutual 
information from the point of view of information theory. As regards the relations 
and epistemological links between the entropy of Shannon and those of Gibbs and 
Boltzmann in a thermodynamical setting, see [200,353,372].


Dynamical Entropy and Information


Repeated uses of an information source or successive localizations of a trajectory 
with respect to a partition of phase-space, give rise to stochastic processes. Since 
equilibrium states u give rise to shift-invariant probability distributions, the Shannon 
entropy is a constant of the motion: for instance, given the time-evolved partition
$T^{-j}(\mathcal{P})$ in (2.38), one has $H_\mu(T^{-j}(\mathcal{P})) = H_\mu(\mathcal{P})$. Therefore, it is not the Shannon
entropy, rather the entropy rate that is useful to quantify the degree of irregularity of 
the dynamics. Since its introduction as a mathematical tool, the notion of entropy rate 
or, more generally, of dynamical entropy, has been playing a major role in the theory 
of classical dynamical systems for it provides links among as different properties as 
dynamical instability, informational compressibility and algorithmic complexity. 


3.1 Dynamical Entropy 


As in Sect. 2.4.3, given a dynamical system corresponding to a measure-theoretic
triplet $(\mathcal{X}, T, \mu)$, we will consider a coarse-graining of $\mathcal{X}$ by means of a finite,
measurable partition $\mathcal{P} = \{P_i\}_{i=1}^{p}$ and identify $\mathcal{P}$ with the random variable (denoted
by the same symbol) corresponding to the process of localization of the system phase-point
(state) within one of the disjoint atoms $P_i$ that cover $\mathcal{X}$. The outcomes of $\mathcal{P}$ are
the labels of the atoms and occur according to the discrete probability distribution
$\mu_{\mathcal{P}} = \{p_{\mathcal{P}}(i) := \mu(P_i)\}_{i=1}^{p}$. Further, the time-evolved partition $\mathcal{P}^j := T^{-j}(\mathcal{P})$ at
time $j$ in (2.38) is identified with the $j$-th random variable of a stochastic process
$\{\mathcal{P}^j\}_{j\in\mathbb{Z}}$. Thus, the refined partitions $\mathcal{P}^{(n)}$ with atoms $P^{(n)}_{\boldsymbol{i}^{(n)}}$ as in (2.40) correspond
to joint random variables with discrete probability distributions as in (2.41),

\[ \mu_{\mathcal{P}}^{(n)} = \Big\{ p(\boldsymbol{i}^{(n)}) := \mu\big(P^{(n)}_{\boldsymbol{i}^{(n)}}\big) \Big\}_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}}, \]

and Shannon entropies (comparing with (2.83), we explicitly indicate the dependence
of the entropy on the measure and the partition)

\[ H_\mu(\mathcal{P}^{(n)}) := -\sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} p(\boldsymbol{i}^{(n)}) \log p(\boldsymbol{i}^{(n)}). \tag{3.1} \]


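Definition 3.1.1 (Entropy Rate) The entropy rate of $(\mathcal{X}, T, \mu)$ with respect to a
finite, measurable partition $\mathcal{P}$ is given by

\[ h_\mu^{KS}(T, \mathcal{P}) := \lim_{n\to\infty} \frac{1}{n} H_\mu(\mathcal{P}^{(n)}) = \inf_{n} \frac{1}{n} H_\mu(\mathcal{P}^{(n)}). \tag{3.2} \]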


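The above limit exists because of the stationarity of $\mu$,

\[ H_\mu(T^{-k}(\mathcal{P})) = H_\mu(\mathcal{P}) \quad \forall\, k \ge 0, \tag{3.3} \]

and because of the subadditivity of the Shannon entropy [370]. Together, they yield,
for all $0 \le p \le n-1$,

\begin{align*}
H_\mu(\mathcal{P}^{(n)}) &\le H_\mu(\mathcal{P}^{(p)}) + H_\mu\Big(\bigvee_{k=p}^{n-1} \mathcal{P}^k\Big) \\
&= H_\mu(\mathcal{P}^{(p)}) + H_\mu\Big(T^{-p}\Big(\bigvee_{k=0}^{n-p-1} \mathcal{P}^k\Big)\Big) = H_p + H_{n-p},
\end{align*}

where $H_n := H_\mu(\mathcal{P}^{(n)})$. Fix $m \in \mathbb{N}$ and set $n = km + r$, $0 \le r < m$; then, from
(2.88)

\[ \frac{H_n}{n} \le \frac{H_m}{m} + \frac{H_r}{km + r}. \]

Since $m$ is fixed, when $n$ goes to infinity, $k$ goes to infinity as well, whence

\[ \limsup_{n\to\infty} \frac{H_n}{n} \le \frac{H_m}{m}. \]

Since $m$ is arbitrary, it follows that

\[ \limsup_{n\to\infty} \frac{H_n}{n} \le \inf_{m} \frac{H_m}{m} \le \liminf_{n\to\infty} \frac{H_n}{n}. \]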


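The entropy rate can be expressed by means of the conditional entropy (2.91)
of two partitions $H_\mu(\mathcal{P}|\mathcal{Q})$ in such a way that $h_\mu^{KS}(T, \mathcal{P})$ measures to which extent
the knowledge of the past of $\mathcal{P}$ may help to predict its future outcomes. Recursively
using (2.91) and (3.3), one gets

\begin{align*}
H_\mu(\mathcal{P}^{(n)}) &= H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{n-1} \mathcal{P}^j\Big) + H_\mu(\mathcal{P}^{(n-1)}) = \sum_{i=1}^{n-1} H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{i} \mathcal{P}^j\Big) + H_\mu(\mathcal{P}), \\
H_\mu(\mathcal{P}^{(n)}) &= H_\mu\Big(\mathcal{P}^{n-1} \Big| \bigvee_{j=0}^{n-2} \mathcal{P}^j\Big) + H_\mu(\mathcal{P}^{(n-1)}) = \sum_{i=1}^{n-1} H_\mu\Big(\mathcal{P}^{i} \Big| \bigvee_{j=0}^{i-1} \mathcal{P}^j\Big) + H_\mu(\mathcal{P}).
\end{align*}

Because of Corollary 2.4.2, the positive terms in the sums are monotonically decreasing
with increasing $i$, thus, arguing as in Remark 2.3.2.1,

\begin{align}
h_\mu^{KS}(T, \mathcal{P}) &= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n-1} H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{i} \mathcal{P}^j\Big) = \lim_{n\to\infty} H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{n} \mathcal{P}^j\Big), \tag{3.4} \\
h_\mu^{KS}(T, \mathcal{P}) &= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n-1} H_\mu\Big(\mathcal{P}^{i} \Big| \bigvee_{j=0}^{i-1} \mathcal{P}^j\Big) = \lim_{n\to\infty} H_\mu\Big(\mathcal{P}^{n} \Big| \bigvee_{j=0}^{n-1} \mathcal{P}^j\Big). \tag{3.5}
\end{align}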


Consider the first equality in (3.5): $\mathcal{P}^i$ is the random variable whose outcomes depend
on which atom of $\mathcal{P}$ the system state is in at time $i$, while $\bigvee_{j=0}^{i-1} \mathcal{P}^j$ is the joint random
variable relative to the atoms visited at previous times $0, 1, \ldots, i-1$. Thus, the
entropy rate corresponding to $\mathcal{P}$ is the average information provided by the next
localization once all the previous ones are known.
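The two characterizations of the entropy rate, as the limit of the entropy per symbol in (3.2) and as the limit of conditional entropies in (3.5), can be compared on a small example. The sketch below assumes a hypothetical two-state stationary Markov chain (the transition matrix `trans` is an arbitrary choice) and obtains the block entropies H_n by brute-force enumeration of all strings of length n.

```python
import itertools
import math

# Hypothetical two-state stationary Markov chain: trans[i][j] = p(j|i).
trans = [[0.9, 0.1],
         [0.4, 0.6]]
# Stationary distribution of a two-state chain in closed form.
escape = trans[0][1] + trans[1][0]
stat = [trans[1][0] / escape, trans[0][1] / escape]

def block_entropy(n):
    """H_n: Shannon entropy of the distribution of strings of length n."""
    h = 0.0
    for s in itertools.product(range(2), repeat=n):
        p = stat[s[0]]
        for a, b in zip(s, s[1:]):
            p *= trans[a][b]
        if p > 0:
            h -= p * math.log(p)
    return h

n_max = 10
H = [0.0] + [block_entropy(n) for n in range(1, n_max + 1)]    # H[n] = H_n
per_symbol = [H[n] / n for n in range(1, n_max + 1)]           # H_n / n, as in (3.2)
increments = [H[n] - H[n - 1] for n in range(1, n_max + 1)]    # conditional entropies, as in (3.5)

# Closed-form entropy rate of a stationary Markov chain.
closed_form = -sum(stat[i] * trans[i][j] * math.log(trans[i][j])
                   for i in range(2) for j in range(2))
print(per_symbol[-1], increments[-1], closed_form)
```

For a Markov chain the conditional entropies stabilize already after one step, whereas H_n / n approaches the same value only like 1/n; both sequences are non-increasing, as the subadditivity argument and Corollary 2.4.2 require.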


Remarks 3.1.1 


1. Let $(\Omega_a, T_\sigma, \pi_A)$ describe a stationary information source. Then, the probabilities
$\pi_{A^{(n)}}$, $A^{(n)} := \bigvee_{j=0}^{n-1} A_j$, together with the corresponding entropies $H(A^{(n)})$ refer
to the statistical ensembles of strings of length $n$ emitted by the source $A$. As
discussed in Sect. 2.4.3, repeated uses of the source $A$ can be described as a
stochastic process $\{A_j\}_{j\in\mathbb{Z}}$ where $A_j$ is the random variable associated to the
$j$-th use of the source. The entropy rate of the source $A$ is thus given by

\[ h(A) := \lim_{n\to\infty} \frac{1}{n} H(A^{(n)}). \tag{3.6} \]

The entropy rate $h(A)$ of a stationary source is the entropy per symbol of the
stationary stochastic process $\{A_j\}_{j\in\mathbb{Z}}$ generated by $A$.

2. Because of subadditivity (2.88), the entropy rate of a partition is always bounded
by its Shannon entropy

\[ h_\mu^{KS}(T, \mathcal{P}) \le H_\mu(\mathcal{P}). \tag{3.7} \]

Furthermore, since $\sum_{P \in \mathcal{P}^{(n)}} \mu(P) = 1$, from (3.1), one gets the lower bound

\[ H_\mu(\mathcal{P}^{(n)}) := -\sum_{P \in \mathcal{P}^{(n)}} \mu(P) \log \mu(P) \ge -\log \sup_{P \in \mathcal{P}^{(n)}} \mu(P), \]

whence [197]

\[ h_\mu^{KS}(T, \mathcal{P}) \ge -\limsup_{n\to+\infty} \frac{1}{n} \log \sup_{P \in \mathcal{P}^{(n)}} \mu(P). \tag{3.8} \]


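3. If $\mathcal{Q} \preceq \mathcal{P}$, Corollary 2.4.1 implies $h_\mu^{KS}(T, \mathcal{Q}) \le h_\mu^{KS}(T, \mathcal{P})$.
4. Since $\mu$ is $T$-invariant, the conditional entropy is stationary, namely

\[ H_\mu(T^{-1}(\mathcal{P})|T^{-1}(\mathcal{Q})) = H_\mu(\mathcal{P}|\mathcal{Q}). \]

Thus, if $T$ is invertible (3.5) can be rewritten as

\[ h_\mu^{KS}(T, \mathcal{P}) = \lim_{n\to\infty} H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{n} T^{j}(\mathcal{P})\Big). \]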


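5. Let $\mathcal{P}$ and $\mathcal{Q}$ be two partitions; then, using Corollary 2.4.1, the relation (2.91),
Lemma 2.4.3, Corollary 2.4.2 and the previous remark, one derives

\begin{align*}
H_\mu(\mathcal{P}^{(n)}) &\le H_\mu(\mathcal{P}^{(n)} \vee \mathcal{Q}^{(n)}) = H_\mu(\mathcal{Q}^{(n)}) + H_\mu(\mathcal{P}^{(n)}|\mathcal{Q}^{(n)}) \\
&\le H_\mu(\mathcal{Q}^{(n)}) + \sum_{i=0}^{n-1} H_\mu(\mathcal{P}^{i}|\mathcal{Q}^{(n)}) \le H_\mu(\mathcal{Q}^{(n)}) + \sum_{i=0}^{n-1} H_\mu(\mathcal{P}^{i}|\mathcal{Q}^{i}),
\end{align*}

whence $H_\mu(\mathcal{P}^{(n)}) \le H_\mu(\mathcal{Q}^{(n)}) + n\, H_\mu(\mathcal{P}|\mathcal{Q})$ implies

\[ h_\mu^{KS}(T, \mathcal{P}) \le h_\mu^{KS}(T, \mathcal{Q}) + H_\mu(\mathcal{P}|\mathcal{Q}). \tag{3.9} \]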


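6. Given a partition $\mathcal{P}$, set $\mathcal{P}_{r,s} := \bigvee_{j=r}^{s} \mathcal{P}^j$, where $r \le s$ and $r \ge 0$ if $T$ is not
invertible. Notice that

\[ \bigvee_{\ell=0}^{n-1} T^{-\ell}(\mathcal{P}_{r,s}) = \bigvee_{\ell=0}^{n-1} \bigvee_{j=r}^{s} \mathcal{P}^{\ell+j} = \bigvee_{j=r}^{s+n-1} \mathcal{P}^{j} = T^{-r}\Big(\bigvee_{j=0}^{s+n-r-1} \mathcal{P}^{j}\Big); \]

then, since $\mu$ is $T$-invariant, from

\[ \frac{1}{n} H_\mu\Big(\bigvee_{\ell=0}^{n-1} T^{-\ell}(\mathcal{P}_{r,s})\Big) = \frac{n+s-r}{n}\, \frac{1}{n+s-r}\, H_\mu\Big(\bigvee_{j=0}^{n+s-r-1} \mathcal{P}^{j}\Big) \]

it follows that $h_\mu^{KS}(T, \mathcal{P}_{r,s}) = h_\mu^{KS}(T, \mathcal{P})$. For instance, $s = -r = n$ gives

\[ h_\mu^{KS}\Big(T, \bigvee_{j=-n}^{n} \mathcal{P}^{j}\Big) = h_\mu^{KS}(T, \mathcal{P}), \quad \forall\, n \ge 0. \tag{3.10} \]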


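7. As before, set $\mathcal{Q} = \bigvee_{j=0}^{k-1} \mathcal{P}^j$, $k \ge 1$; then, $\mathcal{P} \preceq \mathcal{Q}$, $\bigvee_{\ell=0}^{n-1} T^{-k\ell}(\mathcal{Q}) = \bigvee_{j=0}^{kn-1} \mathcal{P}^j$ and (3.9)
yield

\[ \frac{1}{k}\, h_\mu^{KS}(T^k, \mathcal{P}) \le \frac{1}{k}\, h_\mu^{KS}(T^k, \mathcal{Q}) = h_\mu^{KS}(T, \mathcal{P}). \tag{3.11} \]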


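8. After regrouping $\bigvee_{j=0}^{kn-1} T^{-j}(\mathcal{P}) = \bigvee_{\ell=0}^{n-1} \bigvee_{j=0}^{k-1} T^{-k\ell-j}(\mathcal{P})$, from subadditivity
and $T$-invariance of $\mu$ it follows that

\[ \frac{1}{kn} H_\mu\Big(\bigvee_{j=0}^{kn-1} \mathcal{P}^{j}\Big) \le \frac{1}{kn} \sum_{j=0}^{k-1} H_\mu\Big(T^{-j}\Big(\bigvee_{\ell=0}^{n-1} \mathcal{P}^{k\ell}\Big)\Big) = \frac{1}{n} H_\mu\Big(\bigvee_{\ell=0}^{n-1} \mathcal{P}^{k\ell}\Big), \]

whence letting $n \to +\infty$ one obtains $h_\mu^{KS}(T, \mathcal{P}) \le h_\mu^{KS}(T^k, \mathcal{P})$.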


The entropy rate relative to a given partition $\mathcal{P}$ of $(\mathcal{X}, T, \mu)$ strongly depends on
the latter; for instance, if $\mathcal{N}$ is the trivial partition consisting only of $\mathcal{X}$ itself and the
empty set, then $T^{-j}(\mathcal{N}) = \mathcal{N}$ for all $j \ge 0$, whence $h_\mu^{KS}(T, \mathcal{N}) = 0$. The obvious
way of achieving an absolute entropy rate is to look for the greatest possible one; this
leads to the notion of dynamical entropy also known as Kolmogorov-Sinai entropy
(KS-entropy) or metric entropy [205,206].


Definition 3.1.2 (KS Entropy) The dynamical entropy of a classical dynamical
system $(\mathcal{X}, T, \mu)$ is defined as

\[ h_\mu^{KS}(T) := \sup_{\mathcal{P}} h_\mu^{KS}(T, \mathcal{P}), \]

where $\mathcal{P}$ is any finite, measurable partition.


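Remark 3.1.2 The dynamical entropy provides a quantity that remains invariant
under isomorphisms between dynamical systems [77]. Two dynamical systems
$(\mathcal{X}_{1,2}, T_{1,2}, \mu_{1,2})$ with σ-algebras $\Sigma_{1,2}$ are isomorphic if there exist subsets
$\mathcal{X}^0_{1,2} \subseteq \mathcal{X}_{1,2}$ of measure $\mu_{1,2}(\mathcal{X}^0_{1,2}) = 1$ and a one-to-one map $\Phi: \mathcal{X}^0_1 \to \mathcal{X}^0_2$
such that

1. if $S_2 = \Phi(S_1)$ with $S_1 \subseteq \mathcal{X}^0_1$, then $S_1 \in \Sigma_1$ if and only if $S_2 \in \Sigma_2$ and $\mu_1(S_1) = \mu_2(S_2)$,
that is $\mu_2 \circ \Phi = \mu_1$ and $\mu_2 = \mu_1 \circ \Phi^{-1}$ relative to $\mathcal{X}^0_{1,2}$;

2. $T_{1,2}(\mathcal{X}^0_{1,2}) \subseteq \mathcal{X}^0_{1,2}$; namely, the specially selected subsets $\mathcal{X}^0_{1,2}$ must be mapped
into themselves by the dynamics;

3. $\Phi(T_1(x_1)) = T_2(\Phi(x_1))$, that is $T_2 \circ \Phi = \Phi \circ T_1$ and $\Phi^{-1} \circ T_2 = T_1 \circ \Phi^{-1}$ relative
to $\mathcal{X}^0_{1,2}$.

Because of these properties, it turns out that, if $(\mathcal{X}_{1,2}, T_{1,2}, \mu_{1,2})$ are isomorphic, then
$h_{\mu_1}^{KS}(T_1) = h_{\mu_2}^{KS}(T_2)$. The proof is as follows: if $\mathcal{X}^0_{1,2} = \mathcal{X}_{1,2}$, to any partition $\mathcal{P}_1$ of
$\mathcal{X}_1$ there corresponds a partition $\mathcal{P}_2 := \Phi(\mathcal{P}_1)$ and vice versa, the same being true
of the refined partitions $\mathcal{P}_1^{(n)}$ that are mapped into partitions

\[ \bigvee_{j=0}^{n-1} \Phi \circ T_1^{-j}(\mathcal{P}_1) = \bigvee_{j=0}^{n-1} T_2^{-j}(\Phi(\mathcal{P}_1)) = \mathcal{P}_2^{(n)}. \]

The result thus follows since $H_{\mu_1}(\mathcal{P}_1^{(n)}) = H_{\mu_2}(\mathcal{P}_2^{(n)})$.

If $\mathcal{X}^0_{1,2} \subset \mathcal{X}_{1,2}$, consider a finite, measurable partition $\mathcal{P}_1 = \{P^{(1)}_i\}_{i=1}^{p}$ of $\mathcal{X}_1$ and
construct a partition $\mathcal{P}_2$ of $\mathcal{X}_2$ with atoms $P^{(2)}_i := \Phi(P^{(1)}_i \cap \mathcal{X}^0_1)$, $i = 1, 2, \ldots, p$, and
$P^{(2)}_{p+1} := \mathcal{X}_2 \setminus \mathcal{X}^0_2$. Since the latter atom has measure $\mu_2(P^{(2)}_{p+1}) = 0$, from the
properties of $\mathcal{X}^0_{1,2}$ and the isomorphism $\Phi$, it turns out that $H_{\mu_1}(\mathcal{P}_1^{(n)}) = H_{\mu_2}(\mathcal{P}_2^{(n)})$.
This gives $h_{\mu_2}^{KS}(T_2) \ge h_{\mu_1}^{KS}(T_1)$; indeed, $\mathcal{P}_1$ is a generic partition of $\mathcal{X}_1$, but $\mathcal{P}_2$ is
not so for $\mathcal{X}_2$; the result thus follows by exchanging the roles of the two dynamical
systems.

Concluding, two isomorphic dynamical systems must have the same dynamical
entropy; since dynamical systems with the same dynamical entropy need not be
isomorphic, the latter is not a complete invariant [77,111].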


Example 3.1.1 ([77]) Suppose the discrete-time dynamics of $(\mathcal{X}, T, \mu)$ is sampled
by observing the time-evolving system not at each tick of the clock, rather every $k$
ticks; then

\[ h_\mu^{KS}(T^k) = k\, h_\mu^{KS}(T). \tag{3.12} \]

Indeed, consider Remark 3.1.1.7: since $\mathcal{Q}$ depends on $\mathcal{P}$ in a specific way, by varying
$\mathcal{P}$, one does not in general exhaust the whole class of finite measurable partitions of
$\mathcal{X}$. Then,

\[ h_\mu^{KS}(T^k) \ge \sup_{\mathcal{P}} h_\mu^{KS}(T^k, \mathcal{Q}) = k\, h_\mu^{KS}(T). \]

On the other hand, Remark 3.1.1.3 and $\mathcal{P} \preceq \mathcal{Q}$ yield

\[ k\, h_\mu^{KS}(T, \mathcal{P}) = h_\mu^{KS}(T^k, \mathcal{Q}) \ge h_\mu^{KS}(T^k, \mathcal{P}) \Longrightarrow k\, h_\mu^{KS}(T) \ge h_\mu^{KS}(T^k). \]

The technical difficulty of computing the sup in Definition 3.1.2 is overcome when
there does exist a generating partition $\mathcal{P}$ (see Definition 2.3.5) such that, together
with its images at different times $\mathcal{P}^j = T^{-j}(\mathcal{P})$, it provides refined partitions $\mathcal{P}^{(n)}$
that generate the σ-algebra $\Sigma$ of $\mathcal{X}$ when $n \to \infty$.


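Theorem 3.1.1 (Kolmogorov-Sinai Theorem) If the partition $\mathcal{P}$ is generating for
$(\mathcal{X}, T, \mu)$, then $h_\mu^{KS}(T) = h_\mu^{KS}(T, \mathcal{P})$.

Proof Consider $T$ invertible (for $T$ not invertible the argument is the same)
and a generic finite, measurable partition $\mathcal{Q}$; because of the assumption, using
Example 2.4.4, for any $\varepsilon > 0$ one can find an $n > 0$ and a finite partition
$\tilde{\mathcal{P}} \preceq \mathcal{P}_{-n,n} = \bigvee_{j=-n}^{n} \mathcal{P}^j$, that is a partition generated by finite unions of atoms of $\mathcal{P}_{-n,n}$,
such that the conditional entropy $H_\mu(\mathcal{Q}|\tilde{\mathcal{P}}) \le \varepsilon$. Therefore, from (3.9) and (3.10)
together with Corollary 2.4.2, one derives

\begin{align*}
h_\mu^{KS}(T, \mathcal{Q}) &\le h_\mu^{KS}(T, \mathcal{P}_{-n,n}) + H_\mu(\mathcal{Q}|\mathcal{P}_{-n,n}) \le h_\mu^{KS}(T, \mathcal{P}) + H_\mu(\mathcal{Q}|\tilde{\mathcal{P}}) \\
&\le h_\mu^{KS}(T, \mathcal{P}) + \varepsilon,
\end{align*}

whence, choosing $\mathcal{Q}$ such that $h_\mu^{KS}(T) \le h_\mu^{KS}(T, \mathcal{Q}) + \varepsilon$, it follows that

\[ h_\mu^{KS}(T) - \varepsilon \le h_\mu^{KS}(T, \mathcal{Q}) \le h_\mu^{KS}(T, \mathcal{P}) + \varepsilon \]

with $\varepsilon > 0$ arbitrary.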


The following corollaries are often useful for concretely computing the dynamical 
entropy. 


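Corollary 3.1.1 Suppose $\{\mathcal{P}_n\}_{n\in\mathbb{N}}$ is a sequence of finite partitions for $(\mathcal{X}, T, \mu)$ of
increasing finesse, $\mathcal{P}_n \preceq \mathcal{P}_{n+1}$, such that $\bigvee_n \mathcal{P}_n = \Sigma$. Then,

\[ h_\mu^{KS}(T) = \lim_{n\to\infty} h_\mu^{KS}(T, \mathcal{P}_n). \]

Proof Given $\varepsilon > 0$, let $\mathcal{Q}$ be a finite, measurable partition such that $h_\mu^{KS}(T) \le
h_\mu^{KS}(T, \mathcal{Q}) + \varepsilon$; from the assumption and Corollary 2.4.2 it follows that there exist
$n \in \mathbb{N}$ and $\tilde{\mathcal{P}} \preceq \mathcal{P}_n$ such that

\begin{align*}
h_\mu^{KS}(T) - \varepsilon \le h_\mu^{KS}(T, \mathcal{Q}) &\le h_\mu^{KS}(T, \mathcal{P}_n) + H_\mu(\mathcal{Q}|\mathcal{P}_n) \\
&\le h_\mu^{KS}(T, \mathcal{P}_n) + H_\mu(\mathcal{Q}|\tilde{\mathcal{P}}) \\
&\le h_\mu^{KS}(T, \mathcal{P}_n) + \varepsilon \le h_\mu^{KS}(T) + \varepsilon.
\end{align*}

An argument similar to that of the previous proof can be used to show the following.

Corollary 3.1.2 Given $(\mathcal{X}, T, \mu)$, suppose $\Sigma_0$ is a measure algebra that generates
the σ-algebra $\Sigma$ of $\mathcal{X}$. Then,

\[ h_\mu^{KS}(T) = \sup_{\mathcal{P} \subset \Sigma_0} h_\mu^{KS}(T, \mathcal{P}). \]
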
Examples 3.1.2 


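1. Given two dynamical systems $(\mathcal{X}_i, T_i, \mu_i)$, $i = 1, 2$, their direct product
$(\mathcal{X}_1 \times \mathcal{X}_2, T_1 \times T_2, \mu_1 \times \mu_2)$ provides a new dynamical system $(\mathcal{X}, T, \mu)$ consisting
of two statistically and dynamically independent components. Concretely,
$\mathcal{X} := \mathcal{X}_1 \times \mathcal{X}_2$ is the phase-space consisting of points $x = (x_1, x_2)$, $x_{1,2} \in \mathcal{X}_{1,2}$, and
the dynamics $T$ is such that $T x = (T_1 x_1, T_2 x_2)$. Furthermore, if $\Sigma_{1,2}$ are the
σ-algebras of $\mathcal{X}_{1,2}$, then $\mathcal{X}$ remains equipped with the σ-algebra $\Sigma = \Sigma_1 \times \Sigma_2$
of measurable sets of the form $S_1 \times S_2$, $S_{1,2} \in \Sigma_{1,2}$, and with the $T$-invariant
measure on $\mathcal{X}$, $\mu = \mu_1 \times \mu_2$, defined by $\mu(S_1 \times S_2) = \mu_1(S_1)\, \mu_2(S_2)$. Then, [77]

\[ h_{\mu_1 \times \mu_2}^{KS}(T_1 \times T_2) = h_{\mu_1}^{KS}(T_1) + h_{\mu_2}^{KS}(T_2). \]

Indeed, $\Sigma$ is generated by the measure algebra $\bigcup_{\mathcal{P}_1, \mathcal{P}_2} \mathcal{P}_1 \times \mathcal{P}_2$ where $\mathcal{P}_{1,2}$ are
generic finite, measurable partitions in $\mathcal{X}_{1,2}$; thus, from Corollary 3.1.2 and statistical
independence,

\begin{align*}
h_\mu^{KS}(T) &= \sup_{\mathcal{P}_1 \times \mathcal{P}_2} h_\mu^{KS}(T, \mathcal{P}_1 \times \mathcal{P}_2) \\
&= \sup_{\mathcal{P}_1 \times \mathcal{P}_2} \Big( h_{\mu_1}^{KS}(T_1, \mathcal{P}_1) + h_{\mu_2}^{KS}(T_2, \mathcal{P}_2) \Big) \\
&= h_{\mu_1}^{KS}(T_1) + h_{\mu_2}^{KS}(T_2).
\end{align*}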

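2. Bernoulli Systems: (see Example 2.1.4) let $\mu$ be a product measure such that
$p(\boldsymbol{i}^{(n)}) = \prod_{j=0}^{n-1} p(i_j)$. As seen in Example 2.3.3.1, the partition $\mathcal{C}$ of $\Omega_p$ into
$C^0_i = \{\boldsymbol{i} \in \Omega_p : i_0 = i\}$, $i \in \{1, 2, \ldots, p\}$, is generating for the σ-algebra of cylinders.
Therefore,

\begin{align*}
h_\mu^{KS}(T_\sigma) = h_\mu^{KS}(T_\sigma, \mathcal{C}) &= \lim_{n\to\infty} \frac{1}{n} H_\mu(\mathcal{C}^{(n)}) \\
&= -\lim_{n\to\infty} \frac{1}{n} \sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} \Big( \prod_{j=0}^{n-1} p(i_j) \Big) \sum_{\ell=0}^{n-1} \log p(i_\ell) \\
&= -\sum_{i=1}^{p} p(i) \log p(i) = H_\mu(\mathcal{C}).
\end{align*}

3. Markov Processes: Let the measure in $(\Omega_p, T_\sigma, \mu)$ be given, as in
Example 2.4.2.2, by $p(\boldsymbol{i}^{(n)}) = p(i_0) \prod_{j=1}^{n-1} p(i_j|i_{j-1})$ on $\Omega_p^{(n)}$. Again, the partition
$\mathcal{C}$ of the previous example is generating. Therefore, since $\sum_{i=1}^{p} p(i|j) = 1$,

\begin{align*}
h_\mu^{KS}(T_\sigma) = h_\mu^{KS}(T_\sigma, \mathcal{C}) &= \lim_{n\to\infty} \frac{1}{n} H_\mu(\mathcal{C}^{(n)}) \\
&= -\lim_{n\to\infty} \frac{1}{n} \sum_{\boldsymbol{i}^{(n)} \in \Omega_p^{(n)}} p(i_0) \prod_{j=1}^{n-1} p(i_j|i_{j-1}) \Big( \log p(i_0) + \sum_{\ell=1}^{n-1} \log p(i_\ell|i_{\ell-1}) \Big) \\
&= -\sum_{i,j=1}^{p} p(i)\, p(j|i) \log p(j|i).
\end{align*}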


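4. Ergodic Rotations: Consider the irrational rotations on $\mathbb{T}^2$ described by the
triplets $(\mathbb{T}^2, T, \mathrm{d}\boldsymbol{\theta})$. As seen in Example 2.3.3.2, there is a generating partition
$\mathcal{C}$ such that

\[ \bigvee_{j=0}^{+\infty} T^{-j}(\mathcal{C}) = \Sigma = T^{-1}(\Sigma) = \bigvee_{j=1}^{+\infty} T^{-j}(\mathcal{C}), \]

where the last two equalities follow from the invertibility of the dynamics $T$.
Then, as in Example 2.4.4, for any $\varepsilon > 0$, one can find an $n \in \mathbb{N}$ and a partition
$\tilde{\mathcal{C}} \preceq \bigvee_{j=1}^{n} T^{-j}(\mathcal{C})$ such that $H_\mu(\mathcal{C}|\tilde{\mathcal{C}}) \le \varepsilon$. It thus follows from Corollary 2.4.2
that

\[ H_\mu\Big(\mathcal{C} \Big| \bigvee_{j=1}^{n} T^{-j}(\mathcal{C})\Big) \le H_\mu(\mathcal{C}|\tilde{\mathcal{C}}) \le \varepsilon, \]

whence, from (3.4), $h_\mu^{KS}(T) = 0$. This very same argument holds for all reversible
dynamical systems $(\mathcal{X}, T, \mu)$ that possess a partition $\mathcal{P}$ which generates the
σ-algebra of $\mathcal{X}$ as $\Sigma = \bigvee_{j=0}^{+\infty} T^{-j}(\mathcal{P})$.

5. Non-ergodic Rotations: Unlike in the previous example, there exists $k \in \mathbb{N}$ such
that $T^k = \mathbf{1}$, the trivial dynamics with $h_\mu^{KS}(\mathbf{1}) = 0$. Then, from Example 3.1.1,

\[ 0 = h_\mu^{KS}(\mathbf{1}) = h_\mu^{KS}(T^k) = k\, h_\mu^{KS}(T). \]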
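The closed-form entropy rates of Examples 3.1.2.2 and 3.1.2.3 are easy to evaluate numerically. The sketch below is only illustrative: the probability vector `p`, the transition matrix `chain` and the helper names are hypothetical, and the stationary distribution is obtained by plain power iteration, which assumes an irreducible and aperiodic chain.

```python
import math

def bernoulli_entropy_rate(p):
    """Entropy rate of a Bernoulli shift: -sum_i p(i) log p(i)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def stationary_distribution(trans, iterations=1000):
    """Power iteration pi <- pi * trans, with trans[i][j] = p(j|i)."""
    n = len(trans)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return pi

def markov_entropy_rate(trans):
    """Entropy rate of a stationary Markov shift: -sum_{i,j} p(i) p(j|i) log p(j|i)."""
    pi = stationary_distribution(trans)
    n = len(trans)
    return -sum(pi[i] * trans[i][j] * math.log(trans[i][j])
                for i in range(n) for j in range(n) if trans[i][j] > 0)

p = [0.5, 0.3, 0.2]                        # hypothetical Bernoulli weights
h_bernoulli = bernoulli_entropy_rate(p)
h_memoryless = markov_entropy_rate([p, p, p])   # identical rows: no memory

chain = [[0.9, 0.1],                       # hypothetical transition probabilities
         [0.4, 0.6]]
h_markov = markov_entropy_rate(chain)
h_one_symbol = bernoulli_entropy_rate(stationary_distribution(chain))
print(h_bernoulli, h_memoryless, h_markov, h_one_symbol)
```

When all rows of the transition matrix coincide, the Markov formula reduces to the Bernoulli one; for a chain with memory the entropy rate stays below the one-symbol entropy, in agreement with the bound (3.7).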


3.1.1 KS Entropy and Lyapounov Exponents 


In Sect. 2.2, Lyapounov exponents (see Definition 2.1.2) have been introduced as 
indicators of hyperbolic behavior, that is of exponential separation of initially close 
trajectories. In Example 2.2.2 the exponent has been calculated to be $\log 2$ for the Baker map,
which is isomorphic to a Bernoulli shift $(\Omega_2, T_\sigma, \mu)$ with a balanced probability
measure $\mu$; therefore, according to Example 3.1.2.2, the Lyapounov exponent
equals the KS entropy for this system.
From an informational point of view this fact can be understood as being due to
the loss of information along the direction where distances and thus errors increase
exponentially fast [78,321]. It is therefore plausible to expect that all possible expanding
directions contribute with their Lyapounov exponents to the loss of information
and thus to the KS entropy (see Remark 2.1.3.1). This is indeed the content of the
following theorem [239].


Theorem 3.1.2 (Pesin Theorem) Let $(\mathcal{X}, T, \mu)$ be a smooth dynamical system as
in Remark 2.1.3.1; set $\Lambda(x) := \sum_{j:\, \lambda^{(j)}(x) > 0} \lambda^{(j)}(x)\, \dim W_j(x)$; then,

\[ h_\mu^{KS}(T) = \int_{\mathcal{X}} \mathrm{d}\mu(x)\, \Lambda(x). \]

When the dynamical triplet $(\mathcal{X}, T, \mu)$ is ergodic, the Lyapounov exponents, which
are constants of the motion, are constant almost everywhere on $\mathcal{X}$, whence Pesin's
equality assumes the simpler expression

\[ h_\mu^{KS}(T) = \sum_{j:\, \lambda^{(j)} > 0} \lambda^{(j)}. \]


A particular instance of Pesin’s result applied to hyperbolic dynamical systems 
[197,370] is provided by Example 8.2.4. 


Proposition 3.1.1 The KS entropy of the hyperbolic automorphisms of the torus
with positive eigenvalues $\alpha^{\pm 1}$ of the matrix $A$ is

\[ h_\mu^{KS}(T_A) = \log \alpha. \]


Standard proofs of this result can be found in [197,330]; here, we prefer to defer 
it to Chap. 8, where it will be obtained by means of a quantum dynamical entropy 
(see Proposition 8.2.7 and Remark 8.2.4). 
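The value log α in Proposition 3.1.1 can be illustrated numerically. The sketch below assumes, purely as an example, the integer matrix with rows (2, 1) and (1, 1), which has unit determinant and trace larger than 2; this choice is not necessarily the matrix used elsewhere in the text. It compares log α, with α the largest eigenvalue, with a numerical estimate of the positive Lyapounov exponent obtained by iterating a tangent vector.

```python
import math

# Hypothetical hyperbolic matrix: integer entries, determinant 1, trace > 2.
A = [[2, 1],
     [1, 1]]

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
alpha = (trace + math.sqrt(trace ** 2 - 4 * det)) / 2   # largest eigenvalue
ks_entropy = math.log(alpha)                            # value stated in Proposition 3.1.1

def lyapunov_estimate(matrix, steps=200):
    """Average logarithmic growth rate of a tangent vector under repeated action of matrix."""
    v = (1.0, 0.0)
    log_growth = 0.0
    for _ in range(steps):
        w = (matrix[0][0] * v[0] + matrix[0][1] * v[1],
             matrix[1][0] * v[0] + matrix[1][1] * v[1])
        norm = math.hypot(w[0], w[1])
        log_growth += math.log(norm)
        v = (w[0] / norm, w[1] / norm)      # renormalise to avoid overflow
    return log_growth / steps

estimate = lyapunov_estimate(A)
print(alpha, ks_entropy, estimate)
```

Since the map is linear, the tangent dynamics is given by the same matrix at every point, so the estimate converges to log α with an error of order 1/steps; this is the ergodic form of Pesin's equality with a single positive exponent.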


3.1.2 Entropic K-Systems 


In Sect. 2.3.1, K-systems have been defined in terms of the existence of a K-sequence
$\{\Sigma_n\}_{n\in\mathbb{Z}}$ of nested σ-subalgebras (see Definition 2.3.4) or of an algebraic K-sequence
of nested Abelian von Neumann subalgebras (see Definition 2.3.6). We will now show 
that the algebraic characterization is equivalent to the following entropic properties, 
the link being the triviality of the tails of all finite partitions (see (2.73)). 


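Theorem 3.1.3 ([111,258]) Let $(\mathcal{X}, T, \mu)$ be a dynamical triplet; the following
properties are equivalent:

1. there exists a K-sequence $\{\mathcal{P}_n\}_{n\in\mathbb{Z}}$ based upon a finite generating partition $\mathcal{P}$
(see Definition 2.3.5);

2. $\mathrm{Tail}(\mathcal{Q}) = \mathcal{N}$ for any finite measurable partition $\mathcal{Q}$, where $\mathcal{N}$ is the trivial partition
of $\mathcal{X}$;

3. for all finite measurable partitions $\mathcal{Q} \ne \mathcal{N}$ of $\mathcal{X}$,

\[ h_\mu^{KS}(T, \mathcal{Q}) > 0; \tag{3.13} \]

4. for all finite measurable partitions $\mathcal{Q}$ of $\mathcal{X}$,

\[ \lim_{n\to+\infty} h_\mu^{KS}(T^n, \mathcal{Q}) = H_\mu(\mathcal{Q}); \tag{3.14} \]

5. for any two finite measurable partitions $\mathcal{Q}_{1,2}$,

\[ \lim_{n\to+\infty} H_\mu\Big(\mathcal{Q}_1 \Big| \bigvee_{k\ge n} T^{-k}(\mathcal{Q}_2)\Big) = H_\mu(\mathcal{Q}_1); \tag{3.15} \]

6. for any two finite measurable partitions $\mathcal{Q}_{1,2}$,

\[ \lim_{n\to+\infty} H_\mu\Big(\mathcal{Q}_1 \Big| \bigvee_{k\ge n} T^{-k}(\mathcal{Q}_2)\Big) = 0 \Longrightarrow \mathcal{Q}_1 = \mathcal{N}. \tag{3.16} \]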


From the characterization of K-mixing by the triviality of the tails of all their 
finite partitions (condition (2) above), Proposition 2.3.5 gives 


Corollary 3.1.3 A dynamical triplet (¥, T , p) with a finite generating partition is 
a K-system if and only if it is K -mixing. 


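The key observation in the proof of Theorem 3.1.3 is the continuity of the conditional
probabilities as stated in Theorem 2.2.1 and the continuity of entropies and
conditional entropies with respect to their arguments. This fact allows us to recast
(3.4) in the more suggestive form

\[ h_\mu^{KS}(T, \mathcal{P}) = \lim_{n\to\infty} H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{n} \mathcal{P}^j\Big) = H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=1}^{\infty} \mathcal{P}^j\Big), \tag{3.17} \]

where $\mathcal{P}^j = T^{-j}(\mathcal{P})$. Also, by means of (2.73), in (3.15) and (3.16) one rewrites

\begin{align}
\lim_{n\to\infty} H_\mu\Big(\mathcal{Q}_1 \Big| \bigvee_{k\ge n} T^{-k}(\mathcal{Q}_2)\Big) &= H_\mu\Big(\mathcal{Q}_1 \Big| \bigwedge_{n\ge 0} \bigvee_{k\ge n} T^{-k}(\mathcal{Q}_2)\Big) \notag \\
&= H_\mu\big(\mathcal{Q}_1 \big| \mathrm{Tail}(\mathcal{Q}_2)\big). \tag{3.18}
\end{align}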


We shall also need the following two results [111].

Lemma 3.1.1 Given two finite partitions $\mathcal{Q}_{1,2}$,

\[ H_\mu\Big(\mathcal{Q}_1 \Big| \bigvee_{n=1}^{+\infty} T^{-n}(\mathcal{Q}_1) \vee \mathrm{Tail}(\mathcal{Q}_2)\Big) = h_\mu^{KS}(T, \mathcal{Q}_1). \tag{3.19} \]


Proof As a first step, observe that, given a finite partition $\mathcal{Q}$, repeatedly using (2.91)
and (3.3) yields


n—-1 


1(\/ roly T= (Q) = Z a(o) Vr io) 
k=0 j=1 


j=k+1 
m(olV T~1(Q)) = nhiS (7, Q). 


Then, for fixed ¢>0 and sufficiently large n, using Corollary 2.4.2 and 
Definition 3.1.1 one gets 


n—1 


bKS (7, Q1 V Q) = + (V reve|V Ti-\(Q1 v Q)) 


Lı(n) 


n—1 


s> "(V revey TiQ) 


L2(n) 


n—-1 


1 
a H,(\/ T*(Qi v 22) < his (T, Q1 V Q2) +e. 
k=0 


_ Lin) ~ Lam) 
Thus, lim ——— = lim . Further, Lemma 2.91 yields 
n—> +00 n 


n n—> +00 


n—1 +00 
Li) = 8, (\V Trd V Tie v Q)) 


k=0 j=l 

Lii(n) 

+00 l 
+H, AV rk(an|VYV TI(Q)v V TQ) 
j=l j=—n+1 
Li2(n) 
n—-1 +00 
Lan) = m (VV Trd V Tien) + n(V ron] VY V TQ) 

k=0 j=l j=—n+1 


Lai (n) La(n) 
Corollary 2.4.1 implies $L_{11}(n) \le L_{21}(n)$ and $L_{12}(n) \le L_{22}(n)$, then


L L 
hÉS (T, Qı) = lim 0 = lm | 
a= 


+00 n n—>+00 n 


By applying (2.91) and then the argument that led to (3.4) one gets 


L 
ns (T, Qı) = lim — 
n—=> +00 n 
12 +00 +00 
a E k -j -r 
= lim -J H(t |V TI@Q)v V Te) 
k=0 j=l r=—k+1 


n—-1 


+00 Foo 
=n, SS ulel Y merre) 
k=0 r=! 


j=k+1 


lim Hy (Qi V rayv V T-"(Q1)) = H,(Qi| A Cn) 
j=n r=1 n>0 


Ch 


IA 


+00 
H, (Q1 [Tait (Q2) v V TD) < n$ (T, Q1) . 
r=l 


The last equality follows from the fact that the partitions $\mathcal{C}_n$ (not finite in general)
are such that $\mathcal{C}_n \preceq \mathcal{C}_{n-1}$, whereas for the last but one inequality Corollary 2.4.2 has
been used and the fact that


+00 TOO 
Tail (Q2) v V TQ = | A V T*@2) | y V TTD NG. 


r=1 n>0 k>n r=1 n>0 


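Lemma 3.1.2 Given two finite partitions $\mathcal{Q}_{1,2}$,

\[ \mathcal{Q}_2 \preceq \bigvee_{n\in\mathbb{Z}} T^{n}(\mathcal{Q}_1) \Longrightarrow \mathrm{Tail}(\mathcal{Q}_2) \preceq \mathrm{Tail}(\mathcal{Q}_1). \tag{3.20} \]

Proof We shall show that all partitions $\mathcal{Q} \preceq \mathrm{Tail}(\mathcal{Q}_2)$ are such that $\mathcal{Q} \preceq \mathrm{Tail}(\mathcal{Q}_1)$,
too. Notice that $\mathcal{Q} \preceq \bigvee_{n\in\mathbb{Z}} T^{n}(\mathcal{Q}_1)$, by hypothesis. If

\[ H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1) \vee \mathcal{Q}\big) = H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1)\big) \tag{$*$} \]

for all finite partitions $\mathcal{P} \preceq \bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1)$, then by approximating $\mathcal{Q}$ arbitrarily
well by $\bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1)$, continuity allows one to substitute $\mathcal{Q}$ for $\mathcal{P}$ in ($*$). Then,
(2.91) implies

\[ H_\mu\big(\mathcal{Q} \big| \mathrm{Tail}(\mathcal{Q}_1) \vee \mathcal{Q}\big) = H_\mu\big(\mathcal{Q} \big| \mathrm{Tail}(\mathcal{Q}_1)\big) \Longrightarrow \mathcal{Q} \preceq \mathrm{Tail}(\mathcal{Q}_1). \]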


Equality ($*$) is proved as follows: a repeated use of (2.91), together with the
$T$-invariance of $\mathcal{Q}$ (see Remark 2.3.4) and Remark 3.1.1.4, yield


H,f V Tt] VV T~1(Q1) v Q) 


k=—-n j=n+1 


Li(n) 


n 


= g(r ian VV TiQ v Q) = 2n (ail TiQ) v Q) 
k=1 


k=—n j=-k+1 
as well as 
m(\/ THQn| Vir (Qi) = 27 H, dalV ian) 
k=-n j=n+1 
L2(n) 


= 2nh (T, Q1). 


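Since $\mathcal{Q}$ is $T$-invariant, it coincides with $\mathrm{Tail}(\mathcal{Q})$, whence Lemma 3.1.1 ensures that
$L_1(n) = L_2(n)$. Furthermore, since

\[ \mathcal{P} \preceq \bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1) \Longrightarrow \mathcal{P} \vee \bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1) = \bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1), \]

by using (2.91) one gets

\begin{align*}
L_1(n) &= \underbrace{H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big)}_{L_{11}(n)} + \underbrace{H_\mu\Big(\bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1) \Big| \mathcal{P} \vee \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big)}_{L_{12}(n)}, \\
L_2(n) &= \underbrace{H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1)\Big)}_{L_{21}(n)} + \underbrace{H_\mu\Big(\bigvee_{k=-n}^{n} T^{k}(\mathcal{Q}_1) \Big| \mathcal{P} \vee \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1)\Big)}_{L_{22}(n)}.
\end{align*}

Since $L_{11}(n) \le L_{21}(n)$ and $L_{12}(n) \le L_{22}(n)$, $L_1(n) = L_2(n)$ gives

\[ H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big) = H_\mu\Big(\mathcal{P} \Big| \bigvee_{j=n+1}^{+\infty} T^{-j}(\mathcal{Q}_1)\Big) \]

for all $n \ge 0$. Therefore (see the proof of the previous lemma),

\[ H_\mu\Big(\mathcal{P} \Big| \bigwedge_{n\ge 0} \Big(\bigvee_{j=n}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big)\Big) = H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1)\big). \]

The equality ($*$) thus follows from Corollary 2.4.2 and

\[ \mathrm{Tail}(\mathcal{Q}_1) \vee \mathcal{Q} \preceq \bigwedge_{n\ge 0} \Big(\bigvee_{j=n}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big), \]

so that

\begin{align*}
H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1)\big) &\ge H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1) \vee \mathcal{Q}\big) \\
&\ge H_\mu\Big(\mathcal{P} \Big| \bigwedge_{n\ge 0} \Big(\bigvee_{j=n}^{+\infty} T^{-j}(\mathcal{Q}_1) \vee \mathcal{Q}\Big)\Big) = H_\mu\big(\mathcal{P} \big| \mathrm{Tail}(\mathcal{Q}_1)\big).
\end{align*}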


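Proof of Theorem 3.1.3 The equivalences will be proved according to the following
scheme:

\[ (1) \Longleftrightarrow (2) \Longleftrightarrow (3), \qquad (2) \Longrightarrow (5) \Longrightarrow (4) \Longrightarrow (3), \qquad (2) \Longleftrightarrow (6). \]

(1) $\Longrightarrow$ (2): take $\mathcal{Q}_1$ as the K-partition $\mathcal{P}$, then Lemma 3.1.2 implies $\mathrm{Tail}(\mathcal{Q}) \preceq
\mathrm{Tail}(\mathcal{P}) = \mathcal{N}$ for all finite partitions $\mathcal{Q}$.
(2) $\Longrightarrow$ (1): this is the content of Proposition 2.3.6.
(2) $\Longrightarrow$ (3): if $h_\mu^{KS}(T, \mathcal{Q}) = 0$ for a finite partition $\mathcal{Q}$, by means of (3.17) and the
argument of Example 2.4.3 extended by continuity to non-finite contexts, one gets
$\mathcal{Q} \preceq \bigvee_{n=1}^{+\infty} T^{-n}(\mathcal{Q})$. Then, $T^{-k}(\mathcal{Q}) \preceq \bigvee_{n=k+1}^{+\infty} T^{-n}(\mathcal{Q})$, for all $k \ge 0$, whence

\[ \mathrm{Tail}(\mathcal{Q}) = \bigwedge_{k\ge 0} \bigvee_{n\ge k} T^{-n}(\mathcal{Q}) = \bigvee_{n=0}^{+\infty} T^{-n}(\mathcal{Q}) \Longrightarrow \mathcal{Q} \preceq \mathcal{N}. \]

(3) $\Longrightarrow$ (2): let $\mathcal{Q}_2$ be a finite partition with $\mathrm{Tail}(\mathcal{Q}_2) \ne \mathcal{N}$; Lemma 3.1.1 applied
to $\mathcal{Q}_1 \preceq \mathrm{Tail}(\mathcal{Q}_2)$ yields $h_\mu^{KS}(T, \mathcal{Q}_1) = 0$, whence $\mathcal{Q}_1 = \mathcal{N}$.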

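$(2)\Rightarrow(5)$: using (3.18) one gets

$$H_\mu\big(\mathcal{Q}_1\,\big|\,\mathrm{Tail}(\mathcal{Q}_2)\big) = H_\mu(\mathcal{Q}_1|\mathcal{N}) = H_\mu(\mathcal{Q}_1)$$

for all finite partitions $\mathcal{Q}_{1,2}$ (see Example 2.4.3).
$(2)\Rightarrow(6)$: follows from $(2)\Rightarrow(5)$.
$(6)\Rightarrow(2)$: suppose $\mathcal{Q}_1$ is a finite partition; if $\mathcal{Q}_1\preceq\mathrm{Tail}(\mathcal{Q}_2)$, (3.18) yields $0 = H_\mu\big(\mathcal{Q}_1\,\big|\,\mathrm{Tail}(\mathcal{Q}_2)\big)$. Thus, $\mathcal{Q}_1 = \mathcal{N}$ from (6), whence $\mathrm{Tail}(\mathcal{Q}_1) = \mathcal{N}$.
$(5)\Rightarrow(4)$: consider a finite partition $\mathcal{Q}$ and notice that

$$\bigvee_{k=n}^{+\infty}T^{-k}(\mathcal{Q}) \succeq \bigvee_{j=1}^{+\infty}T^{-nj}(\mathcal{Q})\ .$$

Then, Corollary 2.4.2, (3.17) and (3.7) imply

$$H_\mu(\mathcal{Q}) = \lim_{n\to\infty}H_\mu\Big(\mathcal{Q}\,\Big|\,\bigvee_{k=n}^{+\infty}T^{-k}(\mathcal{Q})\Big) \le \lim_{n\to\infty}H_\mu\Big(\mathcal{Q}\,\Big|\,\bigvee_{j=1}^{+\infty}T^{-nj}(\mathcal{Q})\Big) = \lim_{n\to\infty}h^{KS}_\mu(T^{n},\mathcal{Q}) \le H_\mu(\mathcal{Q})\ .$$

$(4)\Rightarrow(3)$: given a finite partition $\mathcal{Q}\ne\mathcal{N}$, choose $\varepsilon > 0$ in such a way that $H_\mu(\mathcal{Q}) - \varepsilon > 0$ and $n$ large enough to have $h^{KS}_\mu(T^{n},\mathcal{Q}) > H_\mu(\mathcal{Q}) - \varepsilon$. Then, from (3.11) one derives

$$h^{KS}_\mu(T,\mathcal{Q}) \ge \frac{1}{n}\,h^{KS}_\mu(T^{n},\mathcal{Q}) > \frac{H_\mu(\mathcal{Q}) - \varepsilon}{n} > 0\ .$$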


3.2 Codes and Shannon Theorems 


As seen in Sect. 2.4 communication channels usually comprise a preliminary encod- 
ing of the source signals. In the following, we shall review some basic facts concern- 
ing the role of entropy in this context, with particular reference to compression of 
information and its transmission through noisy channels.

Definitions 3.2.1 (Codes)

1. A code $\mathcal{E} : I_A\to\Omega_d^*$ for a source $A$ with alphabet $I_A = \{1,2,\ldots,a\}$ is any map which associates source symbols $i\in I_A$ with strings of any length consisting of symbols $x\in I_X = \{1,2,\ldots,d\}$:

$$I_A\ni i\mapsto\mathcal{E}(i) = x^{(n)} = x_1x_2\cdots x_n\in\Omega_d^{(n)}\ ,\quad x_j\in\{1,2,\ldots,d\}\ ,$$

where $\Omega_d^*$ denotes the set $\bigcup_{n\ge 1}\Omega_d^{(n)}$.

2. A code is non-singular if any two different source symbols $i,j\in I_A$ are mapped into different code-words $\mathcal{E}(i)\ne\mathcal{E}(j)\in\Omega_d^*$. In this way, any code-word corresponds to a unique source-symbol.

3. The extension of a code $\mathcal{E} : I_A\to\Omega_d^*$ to strings $i^{(\ell)} = i_1i_2\cdots i_\ell\in\Omega_a^{(\ell)}$ of length $\ell$ is defined by concatenation:

$$\Omega_a^{(\ell)}\ni i^{(\ell)}\mapsto\mathcal{E}^{(\ell)}(i^{(\ell)}) = \mathcal{E}(i_1)\mathcal{E}(i_2)\cdots\mathcal{E}(i_\ell)\in\Omega_d^*\ .$$

4. A code $\mathcal{E}$ is uniquely decodable if its extensions $\mathcal{E}^{(\ell)}$ are non-singular.
5. A code $\mathcal{E}$ is a prefix or an instantaneous code if no code-word prefixes another code-word, that is if no code-word consists of code-symbols added to another code-word.


Examples 3.2.1 ([113])


1. Prefix-codes are uniquely decodable and uniquely decodable codes are non-singular.

2. Let $I_A = \{1,2,3\}$, $I_X = \{0,1\}$; $\mathcal{E}(1) = 0$, $\mathcal{E}(2) = 00$, $\mathcal{E}(3) = 01$ is a non-singular code, but not a uniquely decodable one, for $\mathcal{E}^{(2)}(11) = \mathcal{E}(2) = 00$.

3. The code $\mathcal{E}(1) = 0$, $\mathcal{E}(2) = 01$, $\mathcal{E}(3) = 11$ is not a prefix-code as $\mathcal{E}(1)$ prefixes $\mathcal{E}(2)$. However, it is uniquely decodable for the following reason. Suppose $\mathcal{E}^{(n)}(i^{(n)}) = \mathcal{E}^{(n)}(j^{(n)}) = x$: if $x_1 = 1$ then $x_2 = 1$ and $i_1 = j_1 = 3$; if $x_1 = x_2 = 0$, then $i_1 = j_1 = 1$. Finally, if $x_1 = 0$ and $x_2 = 1$, the first source-symbol is fixed by the length $m$ of the run of 1s that follows the initial 0: $i_1 = j_1 = 1$ when $m$ is even, otherwise $i_1 = j_1 = 2$. Removing the first code-word and repeating the argument, every string of code-words encodes a unique source-word.

4. The code $\mathcal{E}(1) = 0$, $\mathcal{E}(2) = 10$, $\mathcal{E}(3) = 11$ is such that no code-word can be prefix to another. Unlike in the previous one, in this case one need not check the following symbols in order to identify the corresponding source-symbol.
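
The three codes of Examples 3.2.1 can be classified mechanically. The following is a minimal Python sketch, not part of the original text: the function names are hypothetical, code-words are represented as strings of `0`/`1`, and unique decodability is only tested by brute force on source strings of bounded length (so a positive answer is evidence up to that length, not a proof). The brute-force test compares encodings of source strings of all lengths up to the bound, which is slightly stronger than comparing strings of equal length only.

```python
from itertools import product

def is_non_singular(code):
    """Different source symbols are mapped into different code-words."""
    words = list(code.values())
    return len(set(words)) == len(words)

def is_prefix_code(code):
    """No code-word is a prefix of another code-word."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )

def is_uniquely_decodable_up_to(code, max_len):
    """Brute-force test on all source strings with at most max_len symbols.

    Returns False as soon as two different source strings share the same
    encoding; True only means that no collision exists up to max_len.
    """
    seen = {}
    for length in range(1, max_len + 1):
        for source in product(code, repeat=length):
            encoded = "".join(code[s] for s in source)
            if encoded in seen and seen[encoded] != source:
                return False
            seen[encoded] = source
    return True

# The codes of Examples 3.2.1, items 2, 3 and 4.
example_2 = {1: "0", 2: "00", 3: "01"}
example_3 = {1: "0", 2: "01", 3: "11"}
example_4 = {1: "0", 2: "10", 3: "11"}

for name, code in [("2", example_2), ("3", example_3), ("4", example_4)]:
    print(name, is_non_singular(code), is_prefix_code(code),
          is_uniquely_decodable_up_to(code, 6))
```

For item 3 the check is consistent with the fact that the reversed code-words 0, 10, 11 form a prefix-code.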


Prefix-codes are particularly important because the lengths of their code-words satisfy the following inequality.

Proposition 3.2.1 (Kraft's Inequality [113]) If $\mathcal{E} : I_A\to\Omega_d^*$ is a prefix-code over the alphabet $I_X = \{1,\ldots,d\}$ for a source alphabet $I_A = \{1,2,\ldots,a\}$ and $\ell_i$ denotes the length of the code-word $\mathcal{E}(i)$, then

$$\sum_{i=1}^{a}d^{-\ell_i}\le 1\ . \qquad (3.21)$$

This inequality is known as Kraft inequality; vice versa, if a set of lengths $\ell_i$, $i = 1,2,\ldots,a$, satisfies (3.21), then there exists a prefix-code $\mathcal{E} : I_A\to\Omega_d^*$.

Proof The lengths $\ell_i$ need not be all different; let the different ones be ordered such that $\ell_1 < \ell_2 < \cdots < \ell_m$, $m\le a$, and let $N_j$ be the number of source-symbols with code-words of length $\ell_j$. Necessarily, $N_1\le d^{\ell_1}$, otherwise there would be more source-symbols than words of length $\ell_1$ that encode them and the code would be singular. The prefix condition means that none of the $N_1$ code-words can prefix code-words of length $\ell_2$, whence $N_1\,d^{\ell_2-\ell_1}$ code-words are no more available and non-singularity implies $N_2\le d^{\ell_2} - N_1\,d^{\ell_2-\ell_1}$. Continuing, $N_2\,d^{\ell_3-\ell_2}$ and $N_1\,d^{\ell_3-\ell_1}$ words cannot be used as code-words of length $\ell_3$, whence

$$N_3\le d^{\ell_3} - N_1\,d^{\ell_3-\ell_1} - N_2\,d^{\ell_3-\ell_2}\ .$$

Iterating the argument one gets a set of inequalities

$$N_j\le d^{\ell_j} - \sum_{k=1}^{j-1}N_k\,d^{\ell_j-\ell_k}\ ,\qquad 1\le j\le m\ ,$$

the last one ($j = m$) resulting in (3.21). Vice versa, if a set of $m$ different lengths $\ell_j$ satisfy the Kraft inequality, then they also satisfy the inequalities

$$\sum_{k=1}^{j}N_k\,d^{-\ell_k}\le\sum_{i=1}^{a}d^{-\ell_i}\le 1 \implies N_j\le d^{\ell_j} - \sum_{k=1}^{j-1}N_k\,d^{\ell_j-\ell_k}\ ,$$

for $1\le j\le m$. Therefore, the source-symbols $i\in I_A$ can always be regrouped into subsets $I_A(j)$, each with $N_j$ elements, such that there are sufficiently many code-words to construct a prefix-code $I_A(j)\ni i\mapsto\mathcal{E}(i)\in\Omega_d^*$.
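
The converse part of the proof is constructive. The sketch below is one possible way to turn it into code, under the following assumptions that are not in the original text: code-symbols are the digits $0,\ldots,d-1$ instead of $1,\ldots,d$, code-words are returned as tuples of digits, and the function names are hypothetical. Lengths are processed in increasing order and each code-word is the next free node at its depth in the $d$-ary tree; exact rational arithmetic is used for the Kraft sum.

```python
from fractions import Fraction

def kraft_sum(lengths, d=2):
    """Left-hand side of the Kraft inequality (3.21), computed exactly."""
    return sum(Fraction(1, d ** length) for length in lengths)

def to_digits(value, length, d):
    """Write the integer `value` with exactly `length` digits in base d."""
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, d)
        digits.append(remainder)
    return tuple(reversed(digits))

def prefix_code_from_lengths(lengths, d=2):
    """Return code-words (tuples of digits) with the prescribed lengths."""
    if not lengths:
        return []
    if kraft_sum(lengths, d) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    code = [None] * len(lengths)
    next_free = 0                  # next unused node at the current depth
    depth = lengths[order[0]]
    for i in order:
        # Descending to a larger depth multiplies the node index by d^(gap).
        next_free *= d ** (lengths[i] - depth)
        depth = lengths[i]
        code[i] = to_digits(next_free, depth, d)
        next_free += 1
    return code

print(prefix_code_from_lengths([2, 1, 3, 3]))
```

The Kraft inequality guarantees that `next_free` never exceeds the number of nodes available at the current depth, which is the counting argument used in the proof.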


Example 3.2.2 ([113]) Inequality (3.21) extends to countable prefix codes. Indeed, any $x^{(n)} = (x_1,x_2,\ldots,x_n)\in\Omega_d^{(n)}$ can be associated with the interval $\Delta_x := [0.x_1x_2\cdots x_n\,,\,0.x_1x_2\cdots x_n + d^{-n})\subset[0,1]$ by means of the $d$-nary expansion $x = \sum_{j=1}^{n}x_j\,d^{-j} =: 0.x_1x_2\cdots x_n$. Therefore, if a countable set $\{x^{(\ell_i)}\}_{i\in\mathbb{N}}$ of code-words $x^{(\ell_i)}\in\Omega_d^*$ with lengths $\ell_i$ have the prefix property, the corresponding intervals $\Delta_i$ of lengths $d^{-\ell_i}$ are all disjoint and the sum of their lengths cannot exceed 1. Vice versa, given a countable set of lengths satisfying the extended Kraft inequality

$$\sum_{i\in\mathbb{N}}d^{-\ell_i}\le 1\ ,$$

these can be assigned to disjoint $d$-adic intervals whose left ends can be used as code-words of a prefix-code.


Given a source A some codes will prove more adapted to its statistical properties 
than others; for instance, it is convenient to assign shorter code-words to the symbols 
emitted with higher probability. In this context, a useful parameter is the following 
one. 


Definition 3.2.1 ((Average Code Length) [113]) Let $A$ be a source emitting symbols from the alphabet $I_A$ with probabilities $\pi = \{p(i)\}_{i\in I_A}$; the average length of a code $\mathcal{E} : I_A\to\Omega_d^*$ is defined by $L_\pi(\mathcal{E}) := \sum_{i=1}^{a}p(i)\,\ell_i$, where $\ell_i := \ell(\mathcal{E}(i))$ is the length of the code-word $\mathcal{E}(i)$ assigned to the $i$-th source-symbol.


A way to optimize a code relative to a fixed source probability distribution is to try to achieve the shortest average length. If $\mathcal{E}$ is a prefix-code for which (3.21) becomes an equality, the optimal lengths are found by imposing that the quantity $L_\pi(\mathcal{E}) + \lambda\big(\sum_{i=1}^{a}d^{-\ell_i} - 1\big)$ be stationary upon variation of the lengths and of the Lagrange multiplier $\lambda$. Since $\sum_{i=1}^{a}p(i) = 1$, one gets $\lambda^* = 1/\log d$ and $\ell_i^* = -\log_d p(i)$, whence the corresponding average length equals the Shannon entropy in base $d$, $L^* = H_d(A)$. This is the smallest one achievable by a prefix code; indeed, with $D := \sum_{i=1}^{a}d^{-\ell_i}\le 1$, by means of the relative entropy (2.94) and of (2.85), one estimates

$$L_\pi(\mathcal{E}) - L^* = \sum_{i=1}^{a}p(i)\big(\ell_i + \log_d p(i)\big) = \sum_{i=1}^{a}p(i)\log_d\frac{p(i)}{d^{-\ell_i}/D} - \log_d D = S(\pi,\tilde\pi) - \log_d D\ge 0\ , \qquad (3.22)$$

where $\tilde\pi = \{d^{-\ell_i}/D\}_{i=1}^{a}$. Since $\ell_i^*$ is not generally an integer, it cannot be directly used to construct an optimal code; however, set $\ell_i := \lceil -\log_d p(i)\rceil$ ¹ so that $\ell_i^*\le\ell_i < \ell_i^* + 1$ and

$$\sum_{i=1}^{a}d^{-\ell_i}\le\sum_{i=1}^{a}d^{-\ell_i^*} = \sum_{i=1}^{a}p(i) = 1\ .$$

¹ $\lceil x\rceil$ denotes the smallest integer not smaller than $x\in\mathbb{R}_+$.

According to Proposition 3.2.1, one can thus construct a prefix-code $\mathcal{E}$ with average length $L_\pi(\mathcal{E})$ such that

$$H_d(A) = L^*\le L_\pi(\mathcal{E}) = \sum_{i=1}^{a}p(i)\,\ell_i < L^* + 1 = H_d(A) + 1\ .$$

These upper and lower bounds also characterize the average code length $L_\pi(\mathcal{E}_{opt})$ of any optimal code, for $L_\pi(\mathcal{E})\ge L_\pi(\mathcal{E}_{opt})\ge L^*$.
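
The bounds $H_d(A)\le L_\pi(\mathcal{E}) < H_d(A) + 1$ for the lengths $\ell_i = \lceil -\log_d p(i)\rceil$ are easy to check numerically. The following Python sketch is only an illustration: the function names and the two sample distributions are assumptions, and the lengths are found by an integer search (smallest $\ell$ with $d^{-\ell}\le p$) rather than by rounding a floating-point logarithm.

```python
import math

def entropy(probs, d=2):
    """Shannon entropy in base d."""
    return -sum(p * math.log(p, d) for p in probs if p > 0)

def shannon_lengths(probs, d=2):
    """Smallest integers l_i with d**(-l_i) <= p(i), i.e. ceil(-log_d p(i))."""
    lengths = []
    for p in probs:
        length = 0
        while d ** (-length) > p:
            length += 1
        lengths.append(length)
    return lengths

def average_length(probs, lengths):
    """Average code length: sum of p(i) * l_i."""
    return sum(p * length for p, length in zip(probs, lengths))

for probs in ([0.4, 0.3, 0.2, 0.1], [0.5, 0.25, 0.125, 0.125]):
    lengths = shannon_lengths(probs)
    print(lengths, entropy(probs), average_length(probs, lengths))
```

For the second (dyadic) distribution the lengths coincide with $\ell_i^*$ and the average length equals the entropy.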


Example 3.2.3 (Shannon-Fano-Elias Code) Let $A$ be a source that emits symbols $i\in I_A = \{1,2,\ldots,a\}$ with probabilities $\pi = \{p(i)\}_{i=1}^{a}$ and assume, without loss of generality, that $p(i) > 0$. Let $P(i) := \sum_{j=1}^{i}p(j)$; then, to each symbol $i$ there corresponds a jump from $P(i-1)$ to $P(i)$ and the value $Q(i) := P(i-1) + p(i)/2$ belonging to the corresponding step can be used to identify the $i$-th symbol. Since a code-word must contain a finite number of symbols, a suitable truncation of $Q(i)$ is necessary; for this the binary expansion of $Q(i)$ is used. Concretely, one assigns to the $i$-th symbol the code-word $\mathcal{E}(i) = x_1(i)x_2(i)\cdots x_{\ell_i}(i)$, where

$$-\log_2 p(i) + 1\le\ell_i := \lceil -\log_2 p(i)\rceil + 1 < -\log_2 p(i) + 2\ , \qquad (3.23)$$

and $x_j(i)\in\{0,1\}$ are the binary coefficients of the expansion of $Q(i)$ truncated at the $\ell_i$-th digit:

$$Q(i) - \sum_{j=1}^{\ell_i}\frac{x_j(i)}{2^{j}} = \sum_{j=\ell_i+1}^{+\infty}\frac{x_j(i)}{2^{j}}\le 2^{-\ell_i}\le\frac{p(i)}{2} = Q(i) - P(i-1)\ .$$

Since $P(i-1) < Q(i) < P(i)$, $Q(i)$ provides a code-word $\mathcal{E}(i)$ for the symbol $i$ of length $\ell_i < -\log_2 p(i) + 2$. Also, with the notation of Example 3.2.2, the binary intervals

$$\big[\,0.x_1(i)x_2(i)\cdots x_{\ell_i}(i)\,,\ 0.x_1(i)x_2(i)\cdots x_{\ell_i}(i) + 2^{-\ell_i}\big)$$

lie within the steps corresponding to different $i$'s and are thus disjoint. Then, $\mathcal{E}$ is a prefix-code with average length satisfying

$$H_2(A) < L_\pi = \sum_{i=1}^{a}p(i)\,\ell_i = \sum_{i=1}^{a}p(i)\big(\lceil -\log_2 p(i)\rceil + 1\big) < H_2(A) + 2\ .$$
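
The construction of Example 3.2.3 can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the author's: probabilities are passed as exact `Fraction` objects so that the binary digits of $Q(i)$ are computed without rounding, the function name `sfe_code` is hypothetical, and code-words are returned as bit strings.

```python
import math
from fractions import Fraction

def sfe_code(probs):
    """Binary Shannon-Fano-Elias code-words for a list of Fractions p(i) > 0."""
    code = []
    cumulative = Fraction(0)                    # P(i-1)
    for p in probs:
        midpoint = cumulative + p / 2           # Q(i) = P(i-1) + p(i)/2
        length = math.ceil(-math.log2(p)) + 1   # l_i as in (3.23)
        bits = []
        x = midpoint
        for _ in range(length):                 # first l_i binary digits of Q(i)
            x *= 2
            bit = int(x >= 1)
            bits.append(str(bit))
            x -= bit
        code.append("".join(bits))
        cumulative += p
    return code

probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 8), Fraction(1, 8)]
print(sfe_code(probs))
```

With these probabilities the code-word lengths are 3, 2, 4, 4, so the average length is 2.75 bits against an entropy of 1.75 bits, within the bound $H_2(A) + 2$.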


Remark 3.2.1 The difference between the average code-length and the entropy $H_d(A)$ in the case of the assignment $\ell_i = \lceil -\log_d p(i)\rceil$ can be eliminated asymptotically by coding not single source-symbols but whole blocks of them. In this case, given a stationary source $A$ and a prefix-code $\mathcal{E}$, one encodes strings of length $n$, $i^{(n)}\in\Omega_a^{(n)}$, with code-words $\mathcal{E}^{(n)}(i^{(n)})\in\Omega_d^*$ of lengths $\ell_{i^{(n)}}$ and average code-length per symbol

$$L_{\pi_{A^{(n)}}}(\mathcal{E}^{(n)}) := \frac{1}{n}\sum_{i^{(n)}\in\Omega_a^{(n)}}p_{A^{(n)}}(i^{(n)})\,\ell_{i^{(n)}}\ .$$

Then, the same argument developed for codings of single source-symbols yields the bounds

$$\frac{H_d(A^{(n)})}{n}\le L_{\pi_{A^{(n)}}}(\mathcal{E}^{(n)}) < \frac{H_d(A^{(n)})}{n} + \frac{1}{n}\ .$$

Taking the limit $n\to\infty$, one sees that the average code-length per symbol tends to the entropy rate (in base $d$) $h_d(A)$ of the source (see Remark 3.1.1.1). This simple result motivates the following interpretation:

The entropy rate of a stationary source represents the expected number of code-symbols needed to optimally describe the whole stochastic process corresponding to the source.


3.2.1 Source Compression 


Storing or transmitting information consumes a certain amount of resources, like 
the number of uses of a channel or the allocation of memory. In order to minimize 
the costs, the strategy is to compress information as much as possible in such a way 
that it could be efficiently retrieved, that is with small probability of errors. We shall 
start with the case of binary Bernoulli sources A emitting statistically independent 
signals (see (2.31)). 

In such a case, the source amounts to a stochastic process $\{A_j\}_{j\in\mathbb{Z}}$ consisting of independent and identically distributed random variables, each with discrete probability distribution $\pi_A = \{p(i)\}_{i\in I_A}$. Then, the mean value of the random variable

$$L_n(A) := -\frac{1}{n}\sum_{j=0}^{n-1}\log p(A_j) \qquad (3.24)$$

is the Shannon entropy $H(A) = \sum_{i^{(n)}\in\Omega_2^{(n)}}p(i^{(n)})\,L_n(i^{(n)})$, while the variance equals

$$V_n(A) := \big\langle(L_n(A) - H(A))^2\big\rangle = \frac{1}{n}\Big(\big\langle\log^2 p(A)\big\rangle - H^2(A)\Big)\ .$$

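Lemma 3.2.1 (Tschebitcheff Inequality) Let $X$ be a random variable with outcomes $i = 1,2,\ldots,d$, probability $\pi = \{p(i)\}_{i=1}^{d}$, mean value $M := \langle X\rangle$ and variance $V := \langle X^2\rangle - M^2$; then $\mathrm{Prob}\{|X - M|\ge\varepsilon\}\le\dfrac{V}{\varepsilon^2}$.

Proof The upper bound follows from

$$\mathrm{Prob}\{|X - M|\ge\varepsilon\} := \sum_{i:|i-M|\ge\varepsilon}p(i)\le\frac{1}{\varepsilon^2}\sum_{i:|i-M|\ge\varepsilon}p(i)\,(i - M)^2\le\frac{V}{\varepsilon^2}\ .$$

With $X := L_n(A)$ as in (3.24), the previous Lemma yields

$$\mathrm{Prob}\{|L_n(A) - H(A)|\ge\varepsilon\}\le\frac{1}{n\,\varepsilon^2}\Big(\big\langle\log^2 p(A)\big\rangle - H^2(A)\Big)\ .$$

Therefore, chosen $\varepsilon > 0$ and $\delta > 0$, for $n$ sufficiently large, one can select high probability subsets

$$A_\varepsilon^{(n)} := \Big\{i^{(n)}\in\Omega_2^{(n)}\ :\ \Big|-\frac{1}{n}\log p(i^{(n)}) - H(A)\Big|\le\varepsilon\Big\}\ , \qquad (3.25)$$

such that

$$\mathrm{Prob}\big(A_\varepsilon^{(n)}\big)\ge 1 - \delta\ ,\qquad \mathrm{Prob}\big((A_\varepsilon^{(n)})^c\big)\le\delta\ , \qquad (3.26)$$

where $(A_\varepsilon^{(n)})^c := \Omega_2^{(n)}\setminus A_\varepsilon^{(n)}$ is the corresponding low probability subset.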
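Proposition 3.2.2 (Asymptotic Equipartition Property (AEP)) For any $\varepsilon > 0$ and $\delta > 0$, there exists $N_{\varepsilon,\delta}$ such that, for all $n\ge N_{\varepsilon,\delta}$, the high probability subsets $A_\varepsilon^{(n)}\subseteq\Omega_2^{(n)}$ are such that, for all $i^{(n)}\in A_\varepsilon^{(n)}$,

$$e^{-n(H(A)+\varepsilon)}\le p(i^{(n)})\le e^{-n(H(A)-\varepsilon)}\ , \qquad (3.27)$$

while their cardinalities $\#(A_\varepsilon^{(n)})$ satisfy

$$(1 - \delta)\,e^{n(H(A)-\varepsilon)}\le\#\big(A_\varepsilon^{(n)}\big)\le e^{n(H(A)+\varepsilon)}\ . \qquad (3.28)$$

Proof The first statement follows from (3.25), while the second one is a consequence of (3.26) and of

$$1 = \sum_{i^{(n)}\in\Omega_2^{(n)}}p(i^{(n)})\ge\sum_{i^{(n)}\in A_\varepsilon^{(n)}}p(i^{(n)})\ge\#\big(A_\varepsilon^{(n)}\big)\,e^{-n(H(A)+\varepsilon)}\ ,$$

$$1 - \delta\le\sum_{i^{(n)}\in A_\varepsilon^{(n)}}p(i^{(n)})\le\#\big(A_\varepsilon^{(n)}\big)\,e^{-n(H(A)-\varepsilon)}\ .$$

Roughly speaking, the AEP states that, for large $n$, the binary strings of length $n$ can be subdivided into a high probability subspace $A_\varepsilon^{(n)}$ containing $\simeq e^{nH(A)}$ strings, each one of them occurring with probability $\simeq e^{-nH(A)}$. Also, the closer the source entropy (in bits) to 1, the closer $A_\varepsilon^{(n)}$ gets to $\Omega_2^{(n)}$.

For Bernoulli sources, the AEP amounts to $-\dfrac{1}{n}\log p(i^{(n)})\to H(A)$ in probability. In fact, the AEP extends to ergodic sources and, more in general, to symbolic modeling of ergodic dynamical systems, with the Shannon entropy replaced by the entropy rate.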
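
For a binary Bernoulli source the high probability subset of (3.25) can be enumerated explicitly when $n$ is small, which makes the counting argument behind (3.27) and (3.28) concrete. The sketch below is an illustration only: the function name `typical_set` and the values $p = 0.3$, $n = 12$, $\varepsilon = 0.1$ are assumptions, and natural logarithms are used as in (3.27).

```python
import math
from itertools import product

def typical_set(p, n, eps):
    """Typical strings of length n for a Bernoulli source with Prob(1) = p.

    Returns the list of typical strings, their total probability and the
    Shannon entropy H(A) in natural units.
    """
    h = -p * math.log(p) - (1 - p) * math.log(1 - p)
    typical, prob = [], 0.0
    for bits in product((0, 1), repeat=n):
        ones = sum(bits)
        log_prob = ones * math.log(p) + (n - ones) * math.log(1 - p)
        if abs(-log_prob / n - h) <= eps:
            typical.append(bits)
            prob += math.exp(log_prob)
    return typical, prob, h

p, n, eps = 0.3, 12, 0.1
typical, prob, h = typical_set(p, n, eps)
print(len(typical), 2 ** n, prob)
# The two counting bounds used in the proof of Proposition 3.2.2:
print(prob * math.exp(n * (h - eps)) <= len(typical) <= math.exp(n * (h + eps)))
```

At such a small $n$ the typical set carries only part of the total probability; the statement $\mathrm{Prob}(A_\varepsilon^{(n)})\ge 1-\delta$ requires $n$ large.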

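Let $\mathcal{P} = \{P_i\}_{i=1}^{p}$ denote a finite, measurable partition of a reversible ergodic triplet $(\mathcal{X},T,\mu)$ and set $\mathcal{P}_r^s := \bigvee_{j=r}^{s}\mathcal{P}^j$, $\mathcal{P}^j := T^{-j}(\mathcal{P})$. Further, for any $x\in\mathcal{X}$ let $P_r^s(x)$ denote the atom of the partition $\mathcal{P}_r^s$ that contains $x$: for $\mu$-almost all $x$ there is one and only one such atom. Notice that each $\mathcal{P}_r^s$ is a random variable on $\mathcal{X}$ such that

$$P_r^s(x) = \bigcap_{j=r}^{s}T^{-j}(P_{i_j}) \iff T^{j}x\in P_{i_j}\quad\forall\,j = r,r+1,\ldots,s\ .$$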


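Consider now the random variable

$$h_n(x) := -\frac{1}{n}\log\mu\big(P_0^{n-1}(x)\big)\ ; \qquad (3.29)$$

with the notation of Sect. 3.1, its expectation is

$$\mu(h_n) = \int_{\mathcal{X}}d\mu(x)\,h_n(x) = -\frac{1}{n}\sum_{i^{(n)}}\mu\big(P_{i^{(n)}}\big)\log\mu\big(P_{i^{(n)}}\big) = \frac{1}{n}H_\mu\big(\mathcal{P}^{(n)}\big)\ . \qquad (3.30)$$

Rewrite

$$h_n(x) = -\frac{1}{n}\sum_{k=0}^{n-2}\log\frac{\mu\big(P_k^{n-1}(x)\big)}{\mu\big(P_{k+1}^{n-1}(x)\big)} - \frac{1}{n}\log\mu\big(P_{n-1}^{n-1}(x)\big)\ ,\qquad \mathcal{P}_0^0 = \mathcal{P}\ ,$$

and observe that, by the $T$-invariance of $\mu$, $\mu\big(P_k^{n-1}(x)\big) = \mu\big(P_0^{n-1-k}(T^kx)\big)$ and $\mu\big(P_{k+1}^{n-1}(x)\big) = \mu\big(P_1^{n-1-k}(T^kx)\big)$. Then,

$$h_n(x) = \frac{1}{n}\sum_{k=0}^{n-1}g_{n-k-1}(T^kx)\ ,\quad\text{where} \qquad (3.31)$$

$$g_0(x) := -\log\mu\big(P(x)\big)\ ,\qquad g_k(x) := -\log\frac{\mu\big(P_0^k(x)\big)}{\mu\big(P_1^k(x)\big)}\ . \qquad (3.32)$$

All these functions are positive; furthermore, $0\le g := \lim_k g_k$ exists almost everywhere and is integrable. In fact, let $f_k^i$ coincide with $e^{-g_k}$ on $P_i$, that is

$$f_k^i(x) := \frac{\mu\big(P_i\cap P_1^k(x)\big)}{\mu\big(P_1^k(x)\big)}\ ;$$

then, the conditional probability (2.52) of the random variable $\mathcal{P}$ conditioned on the measure algebra generated by $\mathcal{P}_1^k$ reads

$$\mu\big(\mathcal{P} = i\,\big|\,\mathcal{P}_1^k\big)(x) = f_k^i(x)\ .$$

Since, from Theorem 2.2.1, $\lim_k f_k^i$ exists $\mu$-almost everywhere, the same is true of $g = \lim_k g_k$. Now, fix $a\in\mathbb{R}$ and define the following disjoint subsets of $\mathcal{X}$:

$$E_k := \Big\{x\ :\ \max_{1\le j\le k-1}g_j(x)\le a < g_k(x)\Big\}$$

$$F_k^i := \Big\{x\ :\ \max_{1\le j\le k-1}\big(-\log f_j^i(x)\big)\le a < -\log f_k^i(x)\Big\}\ .$$

Using the defining property (2.52) of conditional probabilities, one estimates

$$\mu(E_k) = \sum_{i=1}^{p}\mu\big(P_i\cap F_k^i\big) = \sum_{i=1}^{p}\int_{F_k^i}d\mu\ \mu\big(P_i\,\big|\,\mathcal{P}_1^k\big)\le e^{-a}\sum_{i=1}^{p}\mu\big(F_k^i\big)\quad\text{and}$$

$$\mu\Big(\bigcup_{k=1}^{\infty}E_k\Big) = \sum_{k=1}^{\infty}\mu(E_k)\le e^{-a}\sum_{i=1}^{p}\mu\Big(\bigcup_{k=1}^{\infty}F_k^i\Big)\le p\,e^{-a}\ .$$

Setting $\hat g := \sup_k g_k$ and $G_k := \{x\ :\ k < \hat g(x)\le k+1\}$,

$$\mu(\hat g) = \sum_{k=0}^{\infty}\int_{G_k}d\mu(x)\,\hat g(x)\le\sum_{k=0}^{\infty}(k+1)\,p\,e^{-k} < +\infty\ ,$$

whence $g$ and $\hat g$ are both integrable.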


Example 3.2.4 Consider the case of a bilateral Bernoulli shift as in Example 3.1.2.2. Then, $x = i\in\Omega_a$, and, choosing as $\mathcal{P}$ the generating partition $\mathcal{C}$, one gets $g_k(i) = g_0(i) = -\log\mu(C(i))$. Therefore, the sum in (3.31) yields the time-average of $g_0$, whence one can apply Birkhoff's Theorem 2.3.1 and ergodicity to deduce that

$$\lim_{n\to\infty}h_n(i) = H_\mu(\mathcal{C}) = h^{KS}_\mu(T_\sigma)\qquad\mu\text{-a.e.}$$

and that the asymptotic behavior $p(i^{(n)})\simeq e^{-n\,h^{KS}_\mu(T_\sigma)}$ holds almost everywhere and not only in probability.

Despite the fact that, in general, the functions in (3.31) are different for different $k$'s and thus (3.31) is not a time-average as in Birkhoff's theorem, nonetheless the following result holds.


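Theorem 3.2.1 (Shannon-McMillan-Breiman Theorem) Let $(\mathcal{X},T,\mu)$ be a reversible, ergodic dynamical system; then, for all finite, measurable partitions $\mathcal{P} = \{P_i\}_{i=1}^{p}$,

$$\lim_{n\to\infty}h_n(x) = h^{KS}_\mu(T,\mathcal{P})\qquad\mu\text{-a.e.}$$

Proof ([77,239]) With the notation introduced in the preceding discussion, dominated convergence, $T$-invariance of $\mu$ and (3.30) together with (3.31) yield

$$\mu(g) = \lim_{k\to\infty}\mu(g_k) = \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\mu(g_k) = \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\mu\big(g_{n-k-1}\circ T^k\big) = \lim_{n\to\infty}\mu(h_n) = h^{KS}_\mu(T,\mathcal{P})\ .$$

On the other hand, from ergodicity, $h^{KS}_\mu(T,\mathcal{P}) = \mu(g) = \lim_{n\to\infty}\dfrac{1}{n}\sum_{k=0}^{n-1}g(T^kx)$ $\mu$-a.e., whence the theorem is proved by showing that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\big(g_{n-k-1} - g\big)(T^kx) = 0\qquad\mu\text{-a.e.} \qquad (*)$$

Consider $G_N(x) := \sup_{k\ge N}|g_k(x) - g(x)|$; these functions are integrable and $\lim_N G_N = 0$ $\mu$-a.e., thus, from ergodicity,

$$\limsup_{n\to\infty}\Big|\frac{1}{n}\sum_{k=0}^{n-1}\big(g_{n-k-1} - g\big)(T^kx)\Big|\le\limsup_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\big|\big(g_{n-k-1} - g\big)(T^kx)\big|\le\limsup_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}G_N(T^kx) = \mu(G_N)$$

$\mu$-a.e. and for all $N\in\mathbb{N}$, whence $(*)$.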


Remark 3.2.2 The Shannon-McMillan-Breiman theorem applied to an ergodic source allows a reformulation of the AEP in terms of the KS entropy. Indeed, choosing as $\mathcal{P}$ the standard generating partition as in Example 3.2.4, the almost everywhere convergence of $-\frac{1}{n}\log p(i^{(n)})$ to $h(A)$, that is $p(i^{(n)})\simeq e^{-n\,h(A)}$, ensures that, given $\varepsilon > 0$ and $\delta > 0$, for $n$ sufficiently large, the ensemble of strings of length $n$ can be subdivided into a high probability subspace $A_\varepsilon^{(n)}$ of probability $\simeq 1$ containing $\simeq e^{n\,h(A)}$ strings.

The AEP allows the implementation of the following compression scheme of an ergodic binary source: one considers strings of length $n$, makes a list of those contained in a high probability subset $A_\varepsilon^{(n)}$ and assigns them as a code their position in the list. Since $A_\varepsilon^{(n)}$ contains less than $2^{n(h(A)+\varepsilon)}$ strings (entropies being conveniently computed with logarithms in base 2), the number of bits needed for the encoding is at the most

$$\big\lceil\log_2 2^{n(h(A)+\varepsilon)}\big\rceil + 1 = \lceil n(h(A)+\varepsilon)\rceil + 1\ ,$$

while the strings belonging to the complementary set $(A_\varepsilon^{(n)})^c$ may be encoded by a same integer, say $\#(A_\varepsilon^{(n)}) + 1$. Upon retrieval, the strings belonging to $A_\varepsilon^{(n)}$ are exactly identified by their code, but not those in $(A_\varepsilon^{(n)})^c$; however, since $\mathrm{Prob}\big((A_\varepsilon^{(n)})^c\big)\le\delta$ and $\delta\to 0$ with $n\to\infty$, the larger $n$ gets, the lower is their probability of occurring. Therefore, the probability of error can be made vanishingly small by increasing $n$.
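
The compression scheme just described can be sketched for a binary Bernoulli source as follows. This is a toy illustration under explicit assumptions: the function names, the parameters $p = 0.1$, $n = 16$, $\varepsilon = 0.2$ and the choice of labelling typical strings by $0,1,\ldots,\#(A_\varepsilon^{(n)})-1$ with the next integer shared by all non-typical strings are not taken from the text.

```python
import math
from itertools import product

def build_codebook(p, n, eps):
    """Index the typical strings of a Bernoulli(p) source (entropies in bits).

    Returns the codebook (string -> label), the number of bits per block
    and the probability of error, i.e. of emitting a non-typical string.
    """
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    codebook, prob_typical = {}, 0.0
    for bits in product((0, 1), repeat=n):
        ones = sum(bits)
        log2_prob = ones * math.log2(p) + (n - ones) * math.log2(1 - p)
        if abs(-log2_prob / n - h) <= eps:
            codebook[bits] = len(codebook)
            prob_typical += 2.0 ** log2_prob
    # One extra label is reserved for all non-typical strings.
    n_bits = math.ceil(math.log2(len(codebook) + 1))
    return codebook, n_bits, 1.0 - prob_typical

def encode(codebook, bits):
    """Label of a string; non-typical strings share the label len(codebook)."""
    return codebook.get(bits, len(codebook))

def decode(codebook, label):
    """Inverse of encode on typical strings; None signals a decoding error."""
    inverse = {v: k for k, v in codebook.items()}
    return inverse.get(label)

codebook, n_bits, error = build_codebook(0.1, 16, 0.2)
print(len(codebook), n_bits, error)
```

With these values 8 bits suffice for blocks of 16 source bits, but the error probability is still sizeable: the scheme becomes reliable only as $n$ grows.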


Theorem 3.2.2 (Noiseless Coding Theorem) Let $A$ be an ergodic binary source with entropy rate $h(A)$: binary strings of length $n$ can be encoded by using $nR < n$ bits and vanishing probability of error if $R > h(A)$. If $R < h(A)$, then the probability of error goes to 1 with $n\to\infty$.


Proof The first part of the theorem follows from the previous discussion by applying the equipartition theorem with $R = h(A) + \varepsilon$.
For the second part, let $R = h(A) - \varepsilon$ and consider the high probability subset $A_{\varepsilon/2}^{(n)}$ together with its complement $(A_{\varepsilon/2}^{(n)})^c$. The probability of any subset $B$ of $\Omega_2^{(n)}$ containing $\lfloor 2^{nR}\rfloor$ ² strings can be estimated as follows,

$$\mathrm{Prob}(B)\le\mathrm{Prob}\big(B\cap(A_{\varepsilon/2}^{(n)})^c\big) + \mathrm{Prob}\big(B\cap A_{\varepsilon/2}^{(n)}\big)\le\delta + 2^{nR}\,2^{-n(h(A)-\varepsilon/2)} = \delta + 2^{-n\varepsilon/2}\ ,$$

where $\delta$ is a vanishingly small quantity given by the AEP. Thus, listing the strings belonging to a subset such as $B$, one uses less than $h(A)$ bits per source bit, but, when $n$ gets larger, the probability that an emitted string belongs to $B$ gets vanishingly small and the probability of error close to 1.


3.2.1.1 Universal Source Codings 

The compression protocols discussed in the previous section depend on the knowledge of the source statistics. Interestingly, encoding and decoding schemes exist which work equally well, namely with a same compression rate $R$, for all ergodic sources $A$ with an entropy rate $h(A) < R$, whatever their overall stationary probability distribution: these protocols provide universal source codings.


² $\lfloor x\rfloor$ denotes the largest integer not larger than $x\in\mathbb{R}$.

In the following, we shall consider Bernoulli sources [113], while the general case is discussed in [201,386]. The method used is based on the concept of type.


Let $A$ be a stationary Bernoulli source emitting strings $i^{(n)} = i_1i_2\cdots i_n\in\Omega_a^{(n)}$, $i_j\in I_A = \{1,2,\ldots,a\}$, according to compatible probability distributions $\pi_{A^{(n)}} = \big\{p_{A^{(n)}}(i^{(n)}) = \prod_{j=1}^{n}p_A(i_j)\big\}$. We shall denote

1. by $N(j|i^{(n)})$ the number of times $j\in I_A$ occurs in the string $i^{(n)}$;
2. by $p(j|i^{(n)}) := \dfrac{N(j|i^{(n)})}{n}$ the so-called empirical probability generated by the string $i^{(n)}$ and by $\Pi_{i^{(n)}} := \{p(j|i^{(n)})\}_{j=1}^{a}$ the corresponding empirical distribution. The latter is known as the type of $i^{(n)}$: strings $i^{(n)}$ whose symbols occur with same frequencies belong to a same type $\Pi^{(n)}$;
3. by $\mathcal{P}_n$ the set of all types $\Pi^{(n)}$;
4. by $T(\Pi^{(n)})$ the subset of all strings $i^{(n)}\in\Omega_a^{(n)}$ with a same type $\Pi^{(n)}$.


The construction of universal codings is based on the following bounds; of particular importance is the second one which states that the number of different types increases at most polynomially with $n$.


Lemma 3.2.2 Let $\Pi^{(n)}\in\mathcal{P}_n$ be a type of $\Omega_a^{(n)}$ and let $H(\Pi^{(n)})$ be its Shannon entropy. The number of strings in $T(\Pi^{(n)})$ and the number of all possible types fulfil

$$\#\big(T(\Pi^{(n)})\big)\le 2^{nH(\Pi^{(n)})}\ ,\qquad \#(\mathcal{P}_n)\le(n+1)^{a}\ .$$

Furthermore, the a-priori probability of $T(\Pi^{(n)})$ is such that

$$p_{A^{(n)}}\big(T(\Pi^{(n)})\big)\le 2^{-n\,S(\Pi^{(n)},\pi_A)}\ ,$$

where $S(\Pi^{(n)},\pi_A)$ is the classical relative entropy (see (2.94)).


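Proof Let $P^{(n)}$ be the following empirical probability distribution on $\Omega_a^{(n)}$,

$$P^{(n)}(i^{(n)}) := \prod_{j=1}^{a}p(j|i^{(n)})^{N(j|i^{(n)})} = \prod_{j=1}^{a}2^{\,n\,p(j|i^{(n)})\log_2 p(j|i^{(n)})} = 2^{-nH(\Pi_{i^{(n)}})}\ .$$

The probability of the type class $T(\Pi^{(n)})$ is certainly not larger than 1; thus,

$$1\ge P^{(n)}\big(T(\Pi^{(n)})\big) = \sum_{i^{(n)}\in T(\Pi^{(n)})}P^{(n)}(i^{(n)}) = \#\big(T(\Pi^{(n)})\big)\,2^{-nH(\Pi^{(n)})}$$

yields the first estimate.

The second is a very loose upper bound: each type $\Pi^{(n)}$ is entirely characterized by how many times each symbol $i\in I_A$ occurs in $i^{(n)}$. Without constraints (that can only decrease the number of types) there are $n+1$ choices for each $i = 1,2,\ldots,a$, namely $0,1,\ldots,n$, whence the result.

Finally, the last bound is derived as follows: first, notice that

$$p_{A^{(n)}}(i^{(n)}) = \prod_{t=1}^{n}p_A(i_t) = \prod_{\ell=1}^{a}p_A(\ell)^{N(\ell|i^{(n)})} = \prod_{\ell=1}^{a}2^{\,n\,p(\ell|i^{(n)})\log_2 p_A(\ell)}$$

$$= 2^{\,n\sum_{\ell=1}^{a}\Big(p(\ell|i^{(n)})\log_2 p(\ell|i^{(n)}) - p(\ell|i^{(n)})\log_2\frac{p(\ell|i^{(n)})}{p_A(\ell)}\Big)} = 2^{-n\big(H(\Pi_{i^{(n)}}) + S(\Pi_{i^{(n)}},\pi_A)\big)}\ .$$

Then, using the first upper bound,

$$p_{A^{(n)}}\big(T(\Pi^{(n)})\big) = \sum_{i^{(n)}\in T(\Pi^{(n)})}p_{A^{(n)}}(i^{(n)}) = \#\big(T(\Pi^{(n)})\big)\,2^{-n\big(H(\Pi^{(n)}) + S(\Pi^{(n)},\pi_A)\big)}\le 2^{-n\,S(\Pi^{(n)},\pi_A)}\ .$$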

Because of the first bound in Lemma 3.2.2, at most $nR + 1$ bits are needed to encode the label of a string $i^{(n)}$ of type $\Pi^{(n)}$ with $H(\Pi^{(n)})\le R$, while at most $a\log_2(n+1) + 1$ bits ensure the encoding of the label specifying the type $\Pi^{(n)}$ to which the string belongs (the $+1$ accounts for $nR$ and $a\log_2(n+1)$ not being integers). Therefore, in the limit $n\to\infty$, one expects a compression rate $R$ for all Bernoulli sources with $H(A) < R$.
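
The three bounds of Lemma 3.2.2 can be verified by exhaustive enumeration for a small alphabet and short strings. The Python sketch below is only an illustration: the names, the values $n = 6$, $a = 3$ and the source distribution $(0.5, 0.3, 0.2)$ are assumptions, and source symbols are labelled $0,\ldots,a-1$.

```python
import math
from collections import Counter, defaultdict
from itertools import product

def type_classes(n, a):
    """Group all strings of length n over {0,...,a-1} by their symbol counts."""
    classes = defaultdict(list)
    for s in product(range(a), repeat=n):
        counts = Counter(s)
        classes[tuple(counts[j] for j in range(a))].append(s)
    return classes

def entropy2(dist):
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

def relative_entropy2(dist, ref):
    """Relative entropy S(dist, ref) in bits."""
    return sum(q * math.log2(q / r) for q, r in zip(dist, ref) if q > 0)

def check_lemma(n, source):
    """Check the three bounds of Lemma 3.2.2 for every type of length n."""
    a = len(source)
    classes = type_classes(n, a)
    ok = len(classes) <= (n + 1) ** a
    slack = 1 + 1e-9            # tolerance for floating-point equality cases
    for counts, strings in classes.items():
        empirical = [c / n for c in counts]
        prob = len(strings) * math.prod(p ** c for p, c in zip(source, counts))
        ok = ok and len(strings) <= slack * 2 ** (n * entropy2(empirical))
        ok = ok and prob <= slack * 2 ** (-n * relative_entropy2(empirical, source))
    return len(classes), ok

print(check_lemma(6, [0.5, 0.3, 0.2]))
```

For $n = 6$ and $a = 3$ there are 28 types, far fewer than the loose bound $(n+1)^a = 343$ and than the $3^6 = 729$ strings.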


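Definition 3.2.2 (Universal Codings) Let $R > 0$ and consider an encoding of a Bernoulli source $A$ into binary strings of length $\lceil nR\rceil$, given by $\mathcal{E}^{(n)} : \Omega_a^{(n)}\to\Omega_2^{(\lceil nR\rceil)}$, followed by a decoding procedure $\mathcal{D}^{(n)} : \Omega_2^{(\lceil nR\rceil)}\to\Omega_a^{(n)}$. This gives a universal $(n,2^{nR})$-code if the probability of error

$$P_e^{(n)} := p_{A^{(n)}}\Big(\big\{i^{(n)}\ :\ \mathcal{D}^{(n)}\circ\mathcal{E}^{(n)}(i^{(n)})\ne i^{(n)}\big\}\Big)$$

goes to 0 when $n\to\infty$ and $\mathcal{E}^{(n)}$, $\mathcal{D}^{(n)}$ do not depend on the Bernoulli source probability $\pi_A$.

Proposition 3.2.3 There exist universal source codings $(n,2^{nR})$ for every Bernoulli source with $H(A) < R$.


Proof Given $R > 0$, let $R_n := R - a\,\dfrac{\log_2(n+1)}{n}$; using the first two bounds in Lemma 3.2.2, the subsets $A^{(n)} := \big\{i^{(n)}\in\Omega_a^{(n)}\ :\ H(\Pi_{i^{(n)}})\le R_n\big\}$ have cardinalities such that

$$\#\big(A^{(n)}\big) = \sum_{\substack{\Pi^{(n)}\in\mathcal{P}_n\\ H(\Pi^{(n)})\le R_n}}\#\big(T(\Pi^{(n)})\big)\le\sum_{\substack{\Pi^{(n)}\in\mathcal{P}_n\\ H(\Pi^{(n)})\le R_n}}2^{nH(\Pi^{(n)})}\le\sum_{\substack{\Pi^{(n)}\in\mathcal{P}_n\\ H(\Pi^{(n)})\le R_n}}2^{nR_n}\le(n+1)^{a}\,2^{nR_n} = 2^{nR}\ .$$

Let $\mathcal{E}^{(n)}$ associate to strings in $A^{(n)}$ their label in the list of such strings expressed in bits and let $\mathcal{D}^{(n)}$ be its inverse map. If $H(A) < R$, then, using the third bound in the previous lemma,

$$P_e^{(n)} = 1 - p_{A^{(n)}}\big(A^{(n)}\big) = \sum_{\substack{\Pi^{(n)}\in\mathcal{P}_n\\ H(\Pi^{(n)}) > R_n}}p_{A^{(n)}}\big(T(\Pi^{(n)})\big)\le(n+1)^{a}\max\Big\{p_{A^{(n)}}\big(T(\Pi^{(n)})\big)\ :\ H(\Pi^{(n)}) > R_n\Big\}$$

$$\le(n+1)^{a}\,2^{-n\min\big\{S(\Pi^{(n)},\pi_A)\ :\ H(\Pi^{(n)}) > R_n\big\}}\ .$$

Since $\lim_n R_n = R$ and $H(A) < R$, and the relative entropy $S(\pi_1,\pi_2) = 0$ iff $\pi_1 = \pi_2$, $P_e^{(n)}$ gets exponentially small for $n$ sufficiently large.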
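
For a binary alphabet ($a = 2$) a type is fixed by the number $k$ of 1s in the string, so the universal code of the proof and its error probability can be evaluated exactly. The following sketch illustrates this under assumptions of my own: the function names, the rate $R = 0.8$ and the test sources are hypothetical choices. Note that the set of retained types depends on $n$ and $R$ only, never on the source probability $p$.

```python
import math

def binary_entropy(q):
    """Entropy in bits of the distribution {q, 1-q}."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def retained_types(n, rate):
    """Numbers k of 1s whose type has entropy at most R_n = R - 2 log2(n+1)/n."""
    r_n = rate - 2 * math.log2(n + 1) / n
    return [k for k in range(n + 1) if binary_entropy(k / n) <= r_n]

def codebook_size(n, rate):
    """Number of strings the universal code identifies without error."""
    return sum(math.comb(n, k) for k in retained_types(n, rate))

def error_probability(n, rate, p):
    """Probability that a Bernoulli(p) source emits a string outside the codebook."""
    kept = set(retained_types(n, rate))
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1) if k not in kept)

for n in (200, 400):
    print(n, codebook_size(n, 0.8) <= 2 ** (n * 0.8),
          error_probability(n, 0.8, 0.1), error_probability(n, 0.8, 0.15))
```

Both test sources have entropy below $R = 0.8$ bits (about 0.47 and 0.61), and the same codebook serves both, with an error probability that decreases as $n$ grows.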


3.2.2 Channel Capacity 


Noiseless channels are an exception; usually, signals get distorted during transmission. It can thus happen that a channel outputs a same string $y^{(n)}$ when presented with different input strings $x_1^{(n)}$ and $x_2^{(n)}$, which cannot then be decoded without errors. Like in compression, to counteract distortion one resorts to suitable encoding and decoding procedures of longer and longer strings; however, unlike in compression where redundancies are eliminated, in the presence of noise the strategy is to introduce redundancies in order to lower the possibility that different input strings give rise to a same channel output.


Example 3.2.5 In Example 2.4.1, bits 0 and 1 can be converted into one another with probability $0 < p < 1/2$ by a binary symmetric channel $\mathcal{C}$. The probability of a wrong decoding can be lowered by encoding

$$\mathcal{E}(0) = \underbrace{00\cdots 0}_{2n+1\ \text{times}}\ ,\qquad \mathcal{E}(1) = \underbrace{11\cdots 1}_{2n+1\ \text{times}}\ .$$

Then, $2n+1$ uses of the channel output strings $i^{(2n+1)} := \mathcal{C}^{(2n+1)}\circ\mathcal{E}(i)$ that can be decoded by a majority rule: let $N_{i^{(2n+1)}}(0)$ denote the number of 0s in $i^{(2n+1)}$; then

$$\mathcal{D}^{(2n+1)}\big(i^{(2n+1)}\big) = \begin{cases}0 & \text{if } N_{i^{(2n+1)}}(0) > n\\ 1 & \text{if } N_{i^{(2n+1)}}(0)\le n\end{cases}\ .$$

By such an encoding-decoding procedure one transmits one bit at the cost of $2n+1$ bits; an error occurs, that is $\mathcal{D}\circ\mathcal{C}^{(2n+1)}\circ\mathcal{E}(i)\ne i$, if at least $n+1$ bits of $\mathcal{E}(i)$ are flipped by the channel $\mathcal{C}$. The probability of such an event, $\sum_{k=n+1}^{2n+1}\binom{2n+1}{k}\,p^{k}(1-p)^{2n+1-k}$, vanishes with $n\to\infty$; unfortunately, the transmission rate, that is the number of bits transmitted per use of the channel, $\dfrac{1}{2n+1}$, vanishes, too.
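
The trade-off of Example 3.2.5 between error probability and transmission rate can be tabulated exactly. The sketch below is an illustration with hypothetical function names; it sums the binomial probabilities of at least $n+1$ flips among the $2n+1$ transmitted bits, for a flip probability $p = 0.1$ chosen only as an example.

```python
import math

def majority_error(n, p):
    """Probability that at least n+1 of the 2n+1 transmitted bits are flipped."""
    m = 2 * n + 1
    return sum(math.comb(m, k) * p ** k * (1 - p) ** (m - k)
               for k in range(n + 1, m + 1))

def transmission_rate(n):
    """Bits transmitted per use of the channel."""
    return 1 / (2 * n + 1)

for n in (0, 1, 2, 5, 10):
    print(n, majority_error(n, 0.1), transmission_rate(n))
```

Both columns decrease with $n$: reliability is bought at the price of a vanishing rate, which is what the noisy-channel coding theorem shows can be avoided.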


In the following we shall consider channels $\mathcal{C}$ without memory and without feedback such that each of their uses is independent of the previous inputs and outputs. Further, $n$ uses of the channel $\mathcal{C}$ amount to a single use of a channel $\mathcal{C}^{(n)}$ which maps input strings $x^{(n)}\in I_X^{(n)}$ consisting of $n$ symbols from an alphabet $I_X = \{1,2,\ldots,n_X\}$ into output strings $y^{(n)}\in I_Y^{(n)}$ consisting of symbols from an alphabet $I_Y = \{1,2,\ldots,n_Y\}$. Input and output strings are conveniently described as realizations of stochastic processes $\{X_i\}_{i\in\mathbb{N}}$ and $\{Y_i\}_{i\in\mathbb{N}}$ with joint random variables $X^{(n)} := \bigvee_{j=1}^{n}X_j$ and $Y^{(n)} := \bigvee_{j=1}^{n}Y_j$. The transitions $x^{(n)}\mapsto y^{(n)}$ occur with probabilities $p(y^{(n)}|x^{(n)})$ that factorize (see (2.82)) and are thus completely characterized by the single-use transition probabilities $p(y_j|x_j)$.

One of the great achievements of early information theory was obtained by Shannon who proved that codes exist such that the number $M(n)$ of distinguishable strings $x^{(n)}$ increases with $n$ at a non-zero exponential rate $R$: $M(n)\simeq 2^{nR}$.


Definition 3.2.3 ((Channel Codes and Capacity) [113]) A code $(M,n)$ for a channel $\mathcal{C}$ consists of


1. a set $I_C := \{1,2,\ldots,M\}$;

2. an encoding $\mathcal{E} : I_C\to I_X^{(n)}$ associating a code-word $x^{(n)}(w) = \mathcal{E}(w)$ to any of the indices $w\in I_C$;

3. a decoding procedure $\mathcal{D} : I_Y^{(n)}\to I_C$, $\mathcal{D}(y^{(n)}(w)) =: \hat w\in I_C$, that returns $\hat w\in I_C$ given a channel output $y^{(n)}(w) = \mathcal{C}^{(n)}(x^{(n)}(w))$.

The rate of the code is defined by $R := \dfrac{\log_2 M}{n}$. The probability of an error, $\hat w = \mathcal{D}(y^{(n)}(w))\ne w$, is

$$e_n(w) = \sum_{\substack{y^{(n)}\in I_Y^{(n)}\\ \mathcal{D}(y^{(n)})\ne w}}p\big(y^{(n)}\,\big|\,x^{(n)}(w)\big)\ .$$

The rate is said to be achievable if there exists a sequence of codes $(2^{nR},n)$ ³ with vanishing maximal probability of error $e_n := \max_{w\in I_C}e_n(w)$.
The capacity $C$ of the channel $\mathcal{C}$ is the largest of its achievable rates.


Remark 3.2.3 For a memoryless channel, (2.82) holds; thus, if the probabilities of the input stochastic process $\{X_i\}_{i\in\mathbb{N}}$ factorize, so do the probabilities of the output stochastic process $\{Y_i\}_{i\in\mathbb{N}}$:

$$p_{Y^{(n)}}(y^{(n)}) = \sum_{x^{(n)}\in I_X^{(n)}}p\big(y^{(n)}|x^{(n)}\big)\,p_{X^{(n)}}(x^{(n)}) = \prod_{j=1}^{n}\sum_{x_j}p(y_j|x_j)\,p_X(x_j) = \prod_{j=1}^{n}p_Y(y_j)\ . \qquad (3.33)$$

Shannon's result is that the mutual information (2.93) $I(X;Y)$ is an achievable rate and that the channel capacity is given by

$$C = \max_{\pi_X}I(X;Y)\ . \qquad (3.34)$$
Examples 3.2.6

1. Example 2.4.1.1: $p_B(i) = p_A(i)$, $i = 0,1$, implies $I(A;B) = H(A)$, whence capacity, $C = 1$, is attained at $\pi_A = \{1/2,1/2\}$.

2. Example 2.4.1.2: with $H(p) := -p\log_2 p - (1-p)\log_2(1-p)$,

$$I(A;B) = H(B) + \sum_{i=0}^{1}p_A(i)\sum_{j=0}^{1}p(j|i)\log_2 p(j|i) = H(B) - H(p)\ ,$$

whence capacity $C = 1 - H(p)$ is attained at $\pi_A = \{1/2,1/2\}$, since

$$p_B(0) = p_A(0)(1-p) + p_A(1)\,p = \frac{1}{2}\ ,\qquad p_B(1) = p_A(0)\,p + p_A(1)(1-p) = \frac{1}{2}\ .$$

³ For sake of simplicity, in the following $M = 2^{nR}$ will be identified with $\lceil 2^{nR}\rceil$, the smallest integer not smaller than $M$.

3. Example 2.4.1.3: $p_B(1) = p_A(0)(1-\alpha)$, $p_B(2) = p_A(1)(1-\alpha)$ and $p_B(3) = \alpha\big(p_A(0) + p_A(1)\big) = \alpha$ yield

$$I(A;B) = H(B) - \big(p_A(0) + p_A(1)\big)H(\alpha) = H(B) - H(\alpha) = (1-\alpha)\,H(A)\ ,$$

whence capacity $C = 1-\alpha$ is attained at $\pi_A = \{1/2,1/2\}$.

4. The capacity in (3.34) refers to only one use of the channel $\mathcal{C}$; consider now the channel $\mathcal{C}^{(n)}$ acting on $x^{(n)}\in I_X^{(n)}$ with outputs $y^{(n)}\in I_Y^{(n)}$. The mutual information $I(X^{(n)};Y^{(n)})$ of the corresponding random variables $X^{(n)}$ and $Y^{(n)}$ can be controlled by repeatedly using (2.93). From (2.82),

$$H\big(Y^{(n)}|X^{(n)}\big) = H\big(X^{(n)}\vee Y^{(n)}\big) - H\big(X^{(n)}\big) = H\big(Y_n\,\big|\,X^{(n)}\vee Y^{(n-1)}\big) + H\big(X^{(n)}\vee Y^{(n-1)}\big) - H\big(X^{(n)}\big)$$

$$= \sum_{j=1}^{n}H\big(Y_j\,\big|\,X^{(n)}\vee Y^{(j-1)}\big) = \sum_{j=1}^{n}H(Y_j|X_j)\ .$$

Further, from (2.88),

$$I\big(X^{(n)};Y^{(n)}\big) = H\big(Y^{(n)}\big) - H\big(Y^{(n)}|X^{(n)}\big)\le\sum_{j=1}^{n}\big(H(Y_j) - H(Y_j|X_j)\big)\le nC\ . \qquad (3.35)$$

Therefore, if $C^{(n)}$ denotes the capacity of the channel $\mathcal{C}^{(n)}$, the supremum over all input probability distributions gets $C^{(n)}\le nC$. Actually, from Remark 3.2.3, equality is achieved by choosing a factorizing $\pi_{X^{(n)}}$ such that $p_{X^{(n)}}(x^{(n)}) = \prod_{j=1}^{n}p_X(x_j)$, with $\pi_X$ the one achieving capacity $C$. Then, the output probabilities factorize too and thus $H(Y^{(n)}) = nH(Y)$.
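
The capacities of the binary symmetric channel and of the erasure channel in items 2 and 3 can be cross-checked numerically by maximizing the mutual information over binary input distributions. The sketch below is a crude illustration rather than a general algorithm: the function names are hypothetical, the channel is passed as a matrix `channel[x][y]` $= p(y|x)$, and the maximization is a plain search on a uniform grid of input probabilities.

```python
import math

def mutual_information(input_dist, channel):
    """I(X;Y) in bits for input probabilities p(x) and channel[x][y] = p(y|x)."""
    n_out = len(channel[0])
    out = [sum(px * row[y] for px, row in zip(input_dist, channel))
           for y in range(n_out)]
    total = 0.0
    for px, row in zip(input_dist, channel):
        for y, pyx in enumerate(row):
            if px > 0 and pyx > 0:
                total += px * pyx * math.log2(pyx / out[y])
    return total

def capacity_binary_input(channel, grid=1000):
    """Grid search of max I(X;Y) over input distributions {q, 1-q}.

    Returns the pair (maximal mutual information, maximizing q).
    """
    return max((mutual_information([q / grid, 1 - q / grid], channel), q / grid)
               for q in range(grid + 1))

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p, alpha = 0.1, 0.25
bsc = [[1 - p, p], [p, 1 - p]]                         # binary symmetric channel
bec = [[1 - alpha, 0.0, alpha], [0.0, 1 - alpha, alpha]]  # erasure channel
print(capacity_binary_input(bsc), 1 - binary_entropy(p))
print(capacity_binary_input(bec), 1 - alpha)
```

In both cases the maximum is found at the uniform input distribution, in agreement with the values $1 - H(p)$ and $1 - \alpha$.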


The above relation between capacity and mutual information can be understood as follows. As shown in the last example, if $X^{(n)}$ consists of $n$ independent, identically distributed repetitions of $X$, then the same is true of $Y^{(n)}$ and $X^{(n)}\vee Y^{(n)}$ with respect to $Y$ and $X\vee Y$. With $H(X)$, $H(Y)$ and $H(X,Y)$ the corresponding entropies, based on the AEP, for large $n$ there are roughly $2^{nH(X)}$ $\pi_X$-typical inputs, $2^{nH(Y)}$ $\pi_Y$-typical outputs and $2^{nH(X,Y)}$ jointly typical pairs $(x^{(n)},y^{(n)})$, that is typical with respect to $\pi_{X\vee Y}$. Of course, not all input-output pairs $(x^{(n)},y^{(n)})$ with $x^{(n)}$ $\pi_X$-typical and $y^{(n)}$ $\pi_Y$-typical are jointly typical: this happens with probability roughly equal to

$$\frac{2^{nH(X,Y)}}{2^{nH(X)}\,2^{nH(Y)}} = 2^{-nI(X;Y)}\ .$$

Therefore, in order to encounter a jointly typical pair with fixed output $y^{(n)}$ one needs at least $2^{nI(X;Y)}$ inputs; in other words, one expects that, encoding a number of input strings smaller than $2^{nI(X;Y)}$, none of them should be jointly typical with respect to a same $y^{(n)}$. Vice versa, more than $2^{nI(X;Y)}$ inputs would start having a same jointly typical output and thus being not exactly identifiable. Memoryless channels with independent, identically distributed inputs are thus expected to have achievable rates $R\le I(X;Y)$.

In order to give a mathematical proof of the above intuitive argument, we start by extending the notion of typical strings.


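Definition 3.2.4 (Jointly-typical Strings) Two strings $x^{(n)}\in I_X^{(n)}$ and $y^{(n)}\in I_Y^{(n)}$ are jointly typical if they belong to the subset $A_\varepsilon^{(n)}\subseteq I_X^{(n)}\times I_Y^{(n)}$ such that

$$\Big|-\frac{1}{n}\log_2 p_{X^{(n)}}(x^{(n)}) - H(X)\Big|\le\varepsilon$$

$$\Big|-\frac{1}{n}\log_2 p_{Y^{(n)}}(y^{(n)}) - H(Y)\Big|\le\varepsilon$$

$$\Big|-\frac{1}{n}\log_2 p_{X^{(n)}\vee Y^{(n)}}(x^{(n)},y^{(n)}) - H(X\vee Y)\Big|\le\varepsilon\ ,$$

where $0 < \varepsilon < 1$.

Since (2.82) holds, the argument of the proof of Proposition 3.2.2 gives rise to a jointly typical AEP. Namely, let $\varepsilon > 0$; for sufficiently large $n$'s the probability carried by subsets of strings violating any of the inequalities in the previous definition can be made smaller than $\varepsilon/3$ so that $\mathrm{Prob}(A_\varepsilon^{(n)})\ge 1-\varepsilon$. Moreover, its cardinality fulfils

$$(1-\varepsilon)\,2^{n(H(X\vee Y)-\varepsilon)}\le\#\big(A_\varepsilon^{(n)}\big)\le 2^{n(H(X\vee Y)+\varepsilon)}\ , \qquad (3.36)$$

while the probabilities of strings $x^{(n)}$, $y^{(n)}$ and $(x^{(n)},y^{(n)})$ satisfying the inequalities in Definition 3.2.4 fulfil

$$2^{-n(H(X)+\varepsilon)}\le p_{X^{(n)}}(x^{(n)})\le 2^{-n(H(X)-\varepsilon)} \qquad (3.37)$$

$$2^{-n(H(Y)+\varepsilon)}\le p_{Y^{(n)}}(y^{(n)})\le 2^{-n(H(Y)-\varepsilon)} \qquad (3.38)$$

$$2^{-n(H(X\vee Y)+\varepsilon)}\le p_{X^{(n)}\vee Y^{(n)}}(x^{(n)},y^{(n)})\le 2^{-n(H(X\vee Y)-\varepsilon)}\ . \qquad (3.39)$$

Then, $\mathrm{Prob}\big(\{(x^{(n)},y^{(n)})\in A_\varepsilon^{(n)}\}\big) = \sum_{(x^{(n)},y^{(n)})\in A_\varepsilon^{(n)}}p_{X^{(n)}}(x^{(n)})\,p_{Y^{(n)}}(y^{(n)})$ can be bounded from below and above as follows:

$$(1-\varepsilon)\,2^{-n(I(X;Y)+3\varepsilon)}\le\mathrm{Prob}\big(\{(x^{(n)},y^{(n)})\in A_\varepsilon^{(n)}\}\big)\le 2^{-n(I(X;Y)-3\varepsilon)}\ . \qquad (3.40)$$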
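
The upper bounds in (3.36) and (3.40) hold for every $n$ and can be checked by exhaustive enumeration on a small example. The sketch below is only an illustration: the function name, the binary alphabets and the single-letter joint distribution (a binary symmetric channel with flip probability 0.25 and uniform input) are assumptions, and the lower bounds are not tested since they require $n$ large.

```python
import math
from itertools import product

def jointly_typical_statistics(joint, n, eps):
    """Enumerate jointly typical pairs of binary strings of length n.

    `joint[(x, y)]` is the single-letter joint probability. Returns the
    number of jointly typical pairs, their probability under the product
    of the marginals, I(X;Y) and H(X v Y), all in bits.
    """
    px = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
    py = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}
    hx = -sum(v * math.log2(v) for v in px.values())
    hy = -sum(v * math.log2(v) for v in py.values())
    hxy = -sum(v * math.log2(v) for v in joint.values())
    count, prob = 0, 0.0
    for xs in product((0, 1), repeat=n):
        lx = sum(math.log2(px[x]) for x in xs)
        if abs(-lx / n - hx) > eps:
            continue
        for ys in product((0, 1), repeat=n):
            ly = sum(math.log2(py[y]) for y in ys)
            lxy = sum(math.log2(joint[(x, y)]) for x, y in zip(xs, ys))
            if abs(-ly / n - hy) <= eps and abs(-lxy / n - hxy) <= eps:
                count += 1
                prob += 2.0 ** (lx + ly)
    return count, prob, hx + hy - hxy, hxy

joint = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.125, (1, 1): 0.375}
n, eps = 8, 0.05
count, prob, mutual_info, hxy = jointly_typical_statistics(joint, n, eps)
print(count, prob, 2 ** (-n * (mutual_info - 3 * eps)))
```

Here the jointly typical pairs are exactly those at Hamming distance 2, and independently drawn inputs and outputs fall among them with a probability below the bound $2^{-n(I(X;Y)-3\varepsilon)}$.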


Theorem 3.2.3 (Shannon Noisy-Channel Theorem) All rates R < C, C as in (3.34), 
are achievable and any sequence of codes (nR, n) with the maximal error probability 
en — 0 must have R < C. 
Proof that $e_n \to 0 \Rightarrow R \le C$ Suppose the signals $w \in \{1,2,\dots,M\}$, $M = 2^{nR}$, 
encoded by $(nR, n)$ into $\mathcal{E}(w) = x^{(n)} \in I_X^{(n)}$ are equidistributed; let $W$ denote the 
random variable with outcomes $w$. Using (2.93), (2.95) with $C(B) = \mathcal{E}(W) = X^{(n)}$ 
and (3.35), it follows that 

$$
\begin{aligned}
nR \le \log_2 M = H(W) &= H(W|Y^{(n)}) + I(W; Y^{(n)}) \\
&\le H(W|Y^{(n)}) + I(X^{(n)}; Y^{(n)}) \le H(W|Y^{(n)}) + nC .
\end{aligned}
$$

We now need to connect $H(W|Y^{(n)})$ to the error probability: this is done by means 
of the so-called Fano's inequality. By assumption the maximal error probability in 
Definition 3.2.3 goes to zero with $n$, and so does the average error probability 

$$ e^{(n)}_{av} := \frac{1}{M}\sum_{w=1}^{M} e_n(w) . $$

Let $E := 0$ if $\mathcal{D}(Y^{(n)}) = W$ and $E := 1$ otherwise, so that $\mathrm{Prob}(E = 1) = e^{(n)}_{av}$; $E$ is a random variable determined by $W$ and 
$Y^{(n)}$. Thus, using (2.91), 

$$ H(W|Y^{(n)}) = H(E|W, Y^{(n)}) + H(W|Y^{(n)}) = H(W|E, Y^{(n)}) + H(E|Y^{(n)}) . $$

Now, from Remark 2.4.3.4, $H(E|Y^{(n)}) \le H(E) \le 1$. Further, $E = 0$ implies that 
$W$ is determined by $Y^{(n)}$ so that $H(W|E = 0, Y^{(n)}) = 0$, whereas if $E = 1$ then 
the cardinality of possible values of $W$ is $M - 1$. Therefore, 

$$
\begin{aligned}
H(W|E, Y^{(n)}) &= \sum_{i=0,1} \mathrm{Prob}(E = i)\, H(W|E = i, Y^{(n)}) \\
&\le e^{(n)}_{av} \log_2(M-1) \le e^{(n)}_{av}\, nR \ \Longrightarrow\ H(W|Y^{(n)}) \le 1 + e^{(n)}_{av}\, nR .
\end{aligned}
$$

The result follows since $nR \le 1 + e^{(n)}_{av}\, nR + nC$ implies 

$$ e^{(n)}_{av} \ge 1 - \frac{C}{R} - \frac{1}{nR} , $$

which in turn implies that $e^{(n)}_{av}$ cannot vanish with $n \to \infty$ if $R > C$. 
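The final inequality of the converse can be evaluated numerically to see how the average error stays bounded away from zero above capacity. The sketch below is only illustrative: the function names are hypothetical, and the binary symmetric channel with capacity $1 - h(f)$ is a standard example assumed here, not one taken from the text.

```python
import math

def binary_entropy(f):
    """Binary entropy h(f) in bits, with h(0) = h(1) = 0."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)

def bsc_capacity(f):
    """Capacity of a binary symmetric channel with flip probability f (assumed example)."""
    return 1.0 - binary_entropy(f)

def average_error_lower_bound(rate, capacity, n):
    """Lower bound 1 - C/R - 1/(nR) on the average error probability."""
    return 1.0 - capacity / rate - 1.0 / (n * rate)

# Above capacity the bound stays positive however large n is chosen.
for n in (10, 100, 1000, 10000):
    print(n, average_error_lower_bound(rate=0.8, capacity=0.5, n=n))
```

For rates below capacity the bound is negative and therefore carries no information, in agreement with the achievability part of the theorem.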


The proof of the first part of Theorem 3.2.3 relies on the following steps: 

1. for $w \in \{1,2,\dots,M = 2^{nR}\}$, choose the code-word $x^{(n)}(w)$ at random with 
probability $p^{(n)}(x^{(n)}) = \prod_{i=1}^{n} p_{in}(x_i)$. This gives a random code $\mathcal{E}$ of type $(nR, n)$ 
with overall probability $\mathrm{Prob}(\mathcal{E}) = \prod_{w=1}^{M}\prod_{i=1}^{n} p_{in}(x_i(w))$; 

2. choose the symbols $w$ at random with the same probability $p(w) = M^{-1}$; 

3. if $\mathcal{C}^{(n)}(x^{(n)}) = y^{(n)}$ and there is only one $\hat{w}$ such that $\mathcal{E}(\hat{w}) = x^{(n)}(\hat{w})$ is jointly 
typical with $y^{(n)}$, then associate with $y^{(n)}$ the symbol $\hat{w}$, otherwise declare an 
error. This gives a decoding map $y^{(n)} \mapsto \mathcal{D}(y^{(n)}) = \hat{w}$; 

4. an error is also declared if $\mathcal{D}(y^{(n)}) = \hat{w} \ne w$ and $\mathcal{C}^{(n)}(\mathcal{E}(w)) = y^{(n)}$. 


Proof that $(nR, n)$ is achievable when $R < C$: Let $e^{(n)}_{\mathcal{E}}(w)$ be the probability of 
an error relative to a random code $\mathcal{E}$ and $e^{(n)}_{av}(\mathcal{E})$ the corresponding average error 
probability. Further, let 

$$ P(e) := \sum_{\mathcal{E}} \mathrm{Prob}(\mathcal{E})\, e^{(n)}_{av}(\mathcal{E}) = \frac{1}{M}\sum_{w=1}^{M}\sum_{\mathcal{E}} \mathrm{Prob}(\mathcal{E})\, e^{(n)}_{\mathcal{E}}(w) : $$

this is the average error probability over all randomly generated codes. Then, every 
$w$ gives the same contribution to the error, so $P(e) = \sum_{\mathcal{E}} \mathrm{Prob}(\mathcal{E})\, e^{(n)}_{\mathcal{E}}(1)$ with fixed 
$w = 1$. Let $F_w := \{(x^{(n)}(w), y^{(n)}) \in A_\epsilon^{(n)}\}$, where $A_\epsilon^{(n)}$ is a jointly-typical 
subset. According to the rules of the game, if $y^{(n)} = \mathcal{C}^{(n)}(x^{(n)}(1))$, a decoding error 
occurs when 

1. $(x^{(n)}(1), y^{(n)}) \notin A_\epsilon^{(n)}$, that is when the input corresponding to $w = 1$ and the 
relative output are not jointly typical; 

2. $(x^{(n)}(i), y^{(n)}) \in A_\epsilon^{(n)}$, namely $F_i$ occurs, for some $i \ne 1$, that is when the output corresponding to $w = 1$ is 
jointly-typical with code-words associated to $w \ne 1$. 

The overall average error probability can thus be estimated as follows: 

$$ P(e) = \mathrm{Prob}\Big((F_1)^c \cup \bigcup_{i=2}^{M} F_i\Big) \le \mathrm{Prob}\big((F_1)^c\big) + \sum_{i=2}^{M} \mathrm{Prob}(F_i) . $$

By the jointly-typical AEP, $\mathrm{Prob}((F_1)^c) \le \epsilon$ for $n$ large enough. 
Further, because of the randomness of the code, the inputs $x^{(n)}(i)$, $i \ne 1$, are statistically 
independent from $x^{(n)}(1)$ and $y^{(n)} = \mathcal{C}^{(n)}(x^{(n)}(1))$. Then, the upper bound in (3.40) 
yields 

$$ \sum_{i=2}^{M} \mathrm{Prob}(F_i) \le (M-1)\, 2^{-n(I(X;Y)-3\epsilon)} \le 2^{-n(I(X;Y)-R-3\epsilon)} . $$

If $R < I(X;Y) - 3\epsilon$, the latter quantity gets $\le \epsilon$ for $n$ sufficiently large and thus 
$P(e) \le 2\epsilon$. This implies that there exists at least one code $\mathcal{E}^*$ with $e^{(n)}_{av}(\mathcal{E}^*) \le 2\epsilon$. By 
choosing for $X$ the distribution $\pi^*$ attaining capacity in (3.34), the condition for 
achieving the rate $R$ becomes $R < C$. Finally, at least half of the code-words $x^{(n)}(w)$ 
of $\mathcal{E}^*$ must have $e^{(n)}_{\mathcal{E}^*}(w) \le 4\epsilon$, otherwise $e^{(n)}_{av}(\mathcal{E}^*) > 2\epsilon$. Keeping only these ones changes 
the rate from $R$ to $R(n) := R - 1/n$. The procedure thus yields a sequence of codes 
$(nR(n), n)$ such that $e_n \to 0$ and $R(n) \to R$ for all $R < C$. 
3.3 Classical Machine Learning 


Artificial Neural Networks have proven to be an extremely efficient computational 
model in specific tasks such as pattern recognition or image classification and have 
revolutionized the field of data analysis on classical computers [1, 152,168,319]. 
While in the previous sections we have been concerned with the manipulation of given 
information, in the present one we shall focus upon its processing for classification 
purposes. Indeed, the successes of Neural Networks and Machine Learning on one side, 
and the rapidly growing interest in the use of their techniques in quantum mechanical 
problems and in their adaptation to quantum mechanical scenarios on the other, 
motivate a brief overview of the theory of artificial neurons, also called simple 
perceptrons, and in particular of their applications to sorting out input patterns according 
to a prescribed classification scheme [169,244]. 

An important result by Rosenblatt showed that when a classification problem can 
be solved, then a so-called McCulloch-Pitts perceptron reaches the solution in a finite 
number of adjustments of its internal parameters. However, not all problems can be 
solved by simple perceptrons; the ratio of the number of those that can be solved to 
the pattern dimension leads to the notion of perceptron storage capacity. One of the 
main open questions relates to the possibility that the capacity might be enhanced by 
quantum aided classical perceptrons or by thoroughly quantum perceptrons, an issue 
that will be addressed later on in Sect. 7.8. In the following we shall briefly survey 
those aspects of the vast literature on Machine Learning with Neural Networks that 
are functional to the subsequent investigation of their quantum counterparts. 

Simple perceptrons mimic the functioning of a human neuron which, by 
exchanging electro-chemical signals with other neurons through axons, dendrites 
and synapses, processes incoming information and outputs new information. A 
McCulloch-Pitts perceptron consists of an operative unit capable of two states, labelled 
by 1 (active) and $-1$ (inactive), that are accessed by responding to a set of external 
incoming inputs, vectors in $\mathbb{R}^N$. 

Fig. 3.1 Single layer perceptron: schematic representation (inputs $x_1, \dots, x_N$, weights $w_1, \dots, w_N$, bias $b$, activation function $f$, output $y$) 

With reference to Fig. 3.1, a classical perceptron 
admits a number of real vectors $x^\mu = (x_1^\mu, \dots, x_N^\mu) \in \mathbb{R}^N$, $\mu = 1,2,\dots,p$, as input 
patterns. Given an input pattern $x^\mu$, the perceptron computes an affine transformation 

$$ x \mapsto z := x \cdot w + b , \tag{3.41} $$

with real parameters $w = (w_1, \dots, w_N)$ and $b$, called weights and bias, respectively. 
Subsequently, the perceptron evaluates on the output $z$ an activation function $f : 
\mathbb{R} \to \mathbb{R}$, eventually yielding an output $y := f(z)$. 


Remark 3.3.1 There exist different possible activation functions, some being more 
computationally efficient like $\mathrm{sgn}(z)$, and others more biologically inspired as the 
hyperbolic tangent $f(z) = (e^{z} - e^{-z})/(e^{z} + e^{-z})$, or the sigmoid function $f(z) = 
1/(1 + e^{-z})$. In the following, we consider the activation function $\mathrm{sgn}(z)$. Also, for 
the sake of simplicity, we shall mostly work with $b = 0$. This is not much of a 
restriction; indeed, one can always move to $\mathbb{R}^{N+1}$ and include a bias by adding a 
component $x_0 = 1$ to patterns and $w_0 = b$ to weights. 
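The affine map (3.41) followed by the sign activation, and the absorption of the bias into an extra component, can be sketched as follows. The function names are illustrative, and the convention $\mathrm{sgn}(0) = +1$ is an assumption made only to keep the output in $\{-1, 1\}$.

```python
from itertools import product

def sgn(z):
    """Sign activation; the value at z = 0 is set to +1 by convention (assumption)."""
    return 1 if z >= 0 else -1

def perceptron(x, w, b=0.0):
    """Output y = sgn(x . w + b) of a simple perceptron."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sgn(z)

def absorb_bias(x, w, b):
    """Move to N+1 dimensions: prepend x_0 = 1 to the pattern and w_0 = b to the weights."""
    return (1,) + tuple(x), (b,) + tuple(w)

# Hypothetical weights and bias, used only to compare the two formulations.
w, b = (0.5, -2.0, 1.0), 0.25
for x in product((-1, 1), repeat=3):
    x_aug, w_aug = absorb_bias(x, w, b)
    print(x, perceptron(x, w, b), perceptron(x_aug, w_aug))
```

The two printed outputs coincide for every pattern, which is why working with $b = 0$ entails no loss of generality.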


3.3.1 Classification Tasks with Classical Perceptrons 


A binary classification problem amounts to assigning an input pattern to one of two 
possible classes, indexed by a binary variable $\xi \in \{-1, 1\}$, also called target. For 
the sake of simplicity, we shall focus on binary patterns, 

$$ \Pi := \big\{ x^\mu \in \{-1, 1\}^N : 1 \le \mu \le p \big\} , \tag{3.42} $$

and denote with 

$$ \Sigma := \big\{ \xi = \{\xi^\mu = \pm 1\}_{\mu=1}^{p} \big\} \tag{3.43} $$

the set of $2^p$ possible classifications of the chosen $p$ patterns. The pair $(\Pi, \xi)$ can 
be referred to as a classification problem; choosing $f(z) = \mathrm{sign}(z)$ as activation 
function, the perceptron solves the problem when it correctly classifies the input 
patterns $x^\mu$ by finding a weight vector $w \in \mathbb{R}^N$ such that, for all $\mu = 1,2,\dots,p$, 

$$ \mathrm{sign}(w \cdot x^\mu) = \xi^\mu \iff \mathrm{sign}(\Delta^\mu) > 0 , \tag{3.44} $$

where the quantities 

$$ \Delta^\mu := \xi^\mu\, \frac{w \cdot x^\mu}{\|w\|} , \qquad \|w\|^2 = \sum_{j=1}^{N} w_j^2 , \tag{3.45} $$

are called stabilities. 


Remark 3.3.2 If any weight vector $w$ can be found fulfilling (3.44), then the 
perceptron has solved the classification problem $(\Pi, \xi)$ by linearly separating the input 
patterns associated to $\xi^\mu = +1$, respectively $\xi^\mu = -1$, into those belonging to the 
upper, respectively lower half spaces identified by the hyperplane in $\mathbb{R}^N$ such that 
$w \cdot x = 0$. 
Let $y^\mu := \mathrm{sgn}(w \cdot x^\mu)$ for $1 \le \mu \le p$; if, given a particular target classification $\xi \in 
\Sigma$, $y^\mu \ne \xi^\mu$, one can try to correct the error by changing the weight-vector $w$ into 
$w + \delta^\mu$, where 

$$ \delta^\mu := 2\epsilon\, \xi^\mu x^\mu , \tag{3.46} $$

with $\epsilon > 0$ a small control parameter. Indeed, $\delta^\mu$ gives a positive increment to the 
negative scalar product $\xi^\mu\, w \cdot x^\mu$: 

$$ \delta^\mu \cdot (\xi^\mu x^\mu) = 2\epsilon\, (\xi^\mu)^2\, \|x^\mu\|^2 = 2\epsilon\, \|x^\mu\|^2 > 0 . \tag{3.47} $$

This is known as the Rosenblatt rule and is in fact the first example of back propagation, 
as in the following result. 


Proposition 3.3.1 If a given classification problem $(\Pi, \xi)$ has a solution, it can be 
reached in a finite number of steps. 


Proof Starting the classification problem with a weight-vector $w = 0$ and performing 
$R = \sum_{\mu=1}^{p} r^\mu$ adjustments according to the rule (3.46), where $r^\mu$ is the number of 
corrections relative to errors associated with the input pattern $x^\mu$ and corresponding 
target component $\xi^\mu$, the resulting weight-vector is 

$$ w_R = 2\epsilon \sum_{\mu=1}^{p} r^\mu\, \xi^\mu x^\mu . $$

Since $w_j = w_{j-1} + 2\epsilon\, \xi^\mu x^\mu$, for a suitable $\mu$, the norm of $w_R$ can be bounded from 
above through the telescopic sum 

$$ \|w_R\|^2 = \sum_{j=1}^{R} \big( \|w_j\|^2 - \|w_{j-1}\|^2 \big) , $$

setting $w_0 = 0$. Since the $j$-th adjustment is required when $\xi^\mu\, w_{j-1} \cdot x^\mu < 0$ and the 
inputs $x^\mu \in \mathbb{R}^N$ have been assumed to have components $\pm 1$, one estimates 

$$ \|w_j\|^2 - \|w_{j-1}\|^2 = 4\epsilon\, \xi^\mu\, w_{j-1} \cdot x^\mu + 4\epsilon^2 \|x^\mu\|^2 \le 4\epsilon^2 N . $$

Then, 

$$ \|w_R\| \le 2\epsilon \sqrt{N R} . \tag{3.48} $$

Setting 

$$ \Delta(w) := \min_{1 \le \mu \le p}\, \xi^\mu\, \frac{w \cdot x^\mu}{\|w\|\, \|x^\mu\|} , \tag{3.49} $$

one has $\Delta(w) \le 1$ by the Cauchy-Schwarz inequality, and $\|x^\mu\| = \sqrt{N}$. Furthermore, the 
assumed solvability of the problem implies the existence of a weight-vector $w^*$ such 
that $\Delta(w^*) > 0$. Then, the Cauchy-Schwarz inequality together with (3.48) and 
(3.49) yield 

$$ 1 \ge \frac{w_R \cdot w^*}{\|w_R\|\, \|w^*\|} = \frac{2\epsilon}{\|w_R\|\, \|w^*\|} \sum_{\mu=1}^{p} r^\mu\, \xi^\mu\, w^* \cdot x^\mu \ge \frac{2\epsilon \sqrt{N}\, R\, \Delta(w^*)}{\|w_R\|} \ge \sqrt{R}\, \Delta(w^*) . $$

Thus, when the problem $(\Pi, \xi)$ is solvable, the number $R = \sum_{\mu=1}^{p} r^\mu$ of up-datings 
of the weight-vector yielding the linearly separating hyperplane identified by the 
orthogonal vector $w^*$ is never larger than $1/\Delta^2(w^*)$. 
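The updating rule (3.46) and the bound on the number of corrections can be tried out on small problems. The sketch below is a minimal implementation under stated assumptions: the function names are hypothetical, a pattern with $\xi^\mu\, w \cdot x^\mu = 0$ is also treated as an error so that the procedure can leave the initial point $w = 0$, and the bias is absorbed into a first component $x_0 = 1$ so that all patterns have components $\pm 1$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rosenblatt_train(patterns, targets, eps=0.5, max_updates=10_000):
    """Apply w -> w + 2*eps*xi*x on misclassified patterns, starting from w = 0.

    Returns (weights, number of updates, converged flag).
    """
    w = [0.0] * len(patterns[0])
    updates = 0
    while True:
        clean_pass = True
        for x, xi in zip(patterns, targets):
            if xi * dot(w, x) <= 0:          # error, or undecided at the start
                w = [wj + 2 * eps * xi * xj for wj, xj in zip(w, x)]
                updates += 1
                clean_pass = False
                if updates >= max_updates:
                    return w, updates, False
        if clean_pass:
            return w, updates, True

def margin(w, patterns, targets):
    """The quantity Delta(w) of (3.49)."""
    norm_w = math.sqrt(dot(w, w))
    return min(xi * dot(w, x) / (norm_w * math.sqrt(dot(x, x)))
               for x, xi in zip(patterns, targets))

# Patterns (x_0 = 1, x_1, x_2) with two illustrative target assignments.
patterns = [(1, -1, -1), (1, -1, 1), (1, 1, -1), (1, 1, 1)]
and_targets = [-1, -1, -1, 1]
xor_targets = [-1, 1, 1, -1]

w_and, r_and, ok_and = rosenblatt_train(patterns, and_targets)
print(w_and, r_and, ok_and, 1 / margin(w_and, patterns, and_targets) ** 2)

_, r_xor, ok_xor = rosenblatt_train(patterns, xor_targets, max_updates=1000)
print(r_xor, ok_xor)
```

For the first target assignment the number of updates stays below $1/\Delta^2$ evaluated at the solution found, as the bound requires for any solution; for the second one, which is not linearly separable, the loop only stops because of the cap on the number of updates.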


Unfortunately, not all classification problems can be solved as shown in the following 
example. 


Example 3.3.1 In the case of binary logical functions, solving a classification 
problem $(\Pi, \xi)$ by a perceptron amounts to computing them. By associating $-1$ with 
false and $1$ with true, the binary functions $f_1 := \mathrm{OR}$, $f_2 := \mathrm{AND}$ map the four 
binary patterns 

$$ \Pi = \{ x^1 = (-1,-1),\ x^2 = (-1,1),\ x^3 = (1,-1),\ x^4 = (1,1) \} $$

into the targets $\xi = (-1, 1, 1, 1)$ and $\xi = (-1, -1, -1, 1)$ according to the 
truth-tables 

$$
\begin{aligned}
f_1(x^1) &= -1 , & f_1(x^2) &= 1 , & f_1(x^3) &= 1 , & f_1(x^4) &= 1 , \\
f_2(x^1) &= -1 , & f_2(x^2) &= -1 , & f_2(x^3) &= -1 , & f_2(x^4) &= 1 .
\end{aligned}
$$

Denoting $\xi = -1$ with a black bullet and $\xi = 1$ with a white one, the corresponding 
problems $(\Pi, \xi)$ can be graphically represented as in Fig. 3.2. One can thus linearly 
separate the input patterns as indicated by the targets for the OR and AND functions; 
indeed, in Fig. 3.2, the diagonal lines $x_1 + x_2 + 1 = 0$, respectively $x_1 + x_2 - 1 = 0$, 
identify the hyperplanes in $\mathbb{R}^2$ associated with the weight vectors $w_{1,2} = (1, 1)$ and 
biases $b_1 = 1$, respectively $b_2 = -1$, for the OR, respectively AND functions. 

Consider instead the so-called XOR function, which is true only on inputs where 
one and no more than one component is true, 

$$ f_3(x^1) = -1 , \quad f_3(x^2) = 1 , \quad f_3(x^3) = 1 , \quad f_3(x^4) = -1 . $$

In this case, the target $\xi = (-1, 1, 1, -1)$ cannot be linearly implemented; instead, 
two hyperplanes like those implementing the OR and AND functions are needed to separate 
the patterns $(-1, -1)$ and $(1, 1)$ from $(-1, 1)$ and $(1, -1)$, as shown in Fig. 3.3. 
Fig. 3.2 Single layer perceptron implementation of the OR and AND binary functions (two panels, OR and AND, in the $(x_1, x_2)$ plane) 

Fig. 3.3 XOR problem (patterns in the $(x_1, x_2)$ plane) 


The need for two hyperplanes to separate the patterns in the XOR problem, 
and their choice, correspond to the fact that the XOR logical function can be expressed 
in terms of the OR and AND functions as follows: 

$$ \mathrm{XOR}(x_1, x_2) = \mathrm{AND}\big( \mathrm{OR}(-x_1, -x_2),\ \mathrm{OR}(x_1, x_2) \big) . $$

Correspondingly, the separation by the two hyperplanes can be implemented by the 
perceptron depicted in Fig. 3.4. It consists of two layers: the one to the left is formed by 
two neurons and is called hidden, the one to the right consists of only one neuron and outputs the 
result. The NOT operation $(x_1, x_2) \mapsto (-x_1, -x_2)$ performed on the input pattern 
to the first neuron of the hidden layer is indicated by the dashed arrows in Fig. 3.4.⁴ 


⁴ Notice that the NOT operation can be easily performed by a perceptron that accepts only one 
input $x \in \mathbb{R}$, multiplies it by $-1$ and adds the bias $b = 0.5$. Then, $\mathrm{sign}(-x + 0.5)$ changes the sign 
of $x = \pm 1$. 
Fig. 3.4 Two-layer perceptron solving the XOR problem: the dashed arrows pointing to the first 
neuron of the hidden layer indicate that a NOT operation has to be performed on the input components 


The weights $w_{11} = w_{12} = 1$ and biases $b_1 = b_2 = 1$ implement the OR function on 
the inputs $(-x_1, -x_2)$, respectively $(x_1, x_2)$, through 

$$ \mathrm{sign}(-x_1 - x_2 + 1) , \quad \text{respectively} \quad \mathrm{sign}(x_1 + x_2 + 1) , $$

while $w_1 = w_2 = 1$ and $b = -1$ implement the AND function through 

$$ \mathrm{sign}\big( \mathrm{sign}(-x_1 - x_2 + 1) + \mathrm{sign}(x_1 + x_2 + 1) - 1 \big) . $$
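The two-layer construction can be checked directly on the four binary patterns. The following is a minimal sketch: the function names are illustrative and $\mathrm{sign}(0) = +1$ is an assumed convention that is never actually triggered, since all arguments below are odd integers.

```python
def sign(z):
    """Sign activation (the value at 0 is an assumed convention, not used here)."""
    return 1 if z >= 0 else -1

def xor_network(x1, x2):
    """Two-layer perceptron: hidden OR units followed by an AND output unit."""
    h1 = sign(-x1 - x2 + 1)      # OR on the negated inputs (-x1, -x2)
    h2 = sign(x1 + x2 + 1)       # OR on the inputs (x1, x2)
    return sign(h1 + h2 - 1)     # AND of the two hidden outputs

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print((x1, x2), xor_network(x1, x2))
```

The printed targets are $-1, 1, 1, -1$, namely the XOR truth-table with $-1$ standing for false and $1$ for true.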


3.3.2 Storage Capacity 


We have seen in the previous section that, given $p$ patterns in $\mathbb{R}^N$, there are, in 
principle, $2^p$ ways to separate them into two classes indexed by $\pm 1$. Each way is 
associated to a so-called dichotomy $\xi = \{\xi^\mu\}_{\mu=1}^{p}$, $\xi^\mu = \pm 1$, but some of them may 
need multi-layered perceptrons to be computed. We shall call linearly separable 
those dichotomies that can be implemented by a single hyperplane and denote by 
$C(p, N)$ their number. 

For instance $C(p, 1) = 2$, for all $p$; indeed, when $N = 1$, any division of the line 
into two halves can only separate the cases where all the minuses are associated 
with patterns on one side and all the pluses with patterns on the other one. Also, 
$C(1, N) = 2$ for all $N$, for a single input pattern can always be labelled $\pm 1$. 

On the other hand, when $p = 4$ as in the previous Example 3.3.1, of the $2^4 = 16$ 
dichotomies possible for 4 input patterns, only $C(4, 3) = 14$ are implementable, 
where $N = 3$ since the perceptron needs a bias $b \ne 0$ and this augments to 3 the 
dimension of the space of patterns $(x_0 = 1, x_1, x_2)$ (see Remark 3.3.1). These cases 
are particularly simple instances of the following general counting argument. 


Theorem 3.3.1 (Cover's Function Counting Theorem [112]) Suppose $p$ patterns 
$x^\mu \in \mathbb{R}^N$ are in generic position; namely, for $d < N$, no more than $d + 1$ of them 
lie on a same $d$-dimensional hyperplane. Then, the number $C(p, N)$ of linearly 
implementable dichotomies satisfies the recurrence relation 

$$ C(p, N) = C(p-1, N) + C(p-1, N-1) $$

and is given by 

$$ C(p, N) = 2 \sum_{k=0}^{N-1} \binom{p-1}{k} , \qquad \text{where} \quad m < k \implies \binom{m}{k} = 0 . $$

Proof When adding a point $x^p$ to the patterns $x^1, \dots, x^{p-1}$ in such a way that 
$x^1, \dots, x^{p-1}, x^p$ are still in general position, the hyperplanes implementing the 
$C(p-1, N)$ linearly separable dichotomies for $x^1, \dots, x^{p-1}$ divide into $D_1$ ones 
which pass through $x^p$ and $D_2$ ones which do not. Because of the general position 
assumption, a suitably small shift of any hyperplane in the first class provides two 
linearly separable dichotomies for $x^1, \dots, x^{p-1}, x^p$. There are $C(p-1, N-1)$ 
such hyperplanes for $x^1, \dots, x^{p-1}$ since they are constrained by $x^p$ belonging to 
them, which decreases their effective dimensionality. Instead, each one of the hyperplanes 
in the other class identifies only one dichotomy for $x^1, \dots, x^{p-1}, x^p$. Therefore, 

$$ C(p, N) = 2D_1 + D_2 = (D_1 + D_2) + D_1 = C(p-1, N) + C(p-1, N-1) . $$

By iteration one finds a Newton binomial expansion 

$$ C(p, N) = \sum_{k=0}^{p-1} \binom{p-1}{k}\, C(1, N-k) = 2 \sum_{k=0}^{N-1} \binom{p-1}{k} , \tag{3.50} $$

where it has been used that, as already observed, $C(1, N) = 2$ for all $N \ge 1$, while 
$C(1, N) = 0$ for $N \le 0$. 


Besides easily checking that $C(p, 1) = C(1, N) = 2$ for all $p, N \ge 1$, using 
(3.50) one also finds that $p \le N$ yields $C(p, N) = 2^p$, while, for $p = 2N$, 

$$ C(2N, N) = 2 \sum_{k=0}^{N-1} \binom{2N-1}{k} = \sum_{k=0}^{N-1} \binom{2N-1}{k} + \sum_{k=N}^{2N-1} \binom{2N-1}{k} = 2^{2N-1} = 2^{p-1} . $$

One thus sees that, by increasing $p$, the fraction of linearly implementable 
dichotomies, $C(p, N)/2^p$, decreases. 
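The closed formula (3.50) and the special cases just listed can be verified with exact integer arithmetic. The following sketch uses the standard-library function `math.comb`; the name `cover_count` is an illustrative choice.

```python
from math import comb

def cover_count(p, n):
    """Number C(p, N) of linearly implementable dichotomies, from (3.50).

    math.comb(m, k) returns 0 when k > m, matching the convention of the theorem;
    for n <= 0 the sum is empty and the count vanishes.
    """
    return 2 * sum(comb(p - 1, k) for k in range(n))

# Special cases discussed in the text.
print(cover_count(4, 3))                    # four patterns, N = 3 (bias included)
print([cover_count(p, 1) for p in range(1, 6)])
print(cover_count(10, 5), 2 ** 9)           # p = 2N

# Recurrence C(p, N) = C(p-1, N) + C(p-1, N-1) on a small grid.
print(all(cover_count(p, n) == cover_count(p - 1, n) + cover_count(p - 1, n - 1)
          for p in range(2, 12) for n in range(1, 12)))
```

The first line reproduces the 14 implementable dichotomies out of 16 of Example 3.3.1, the two missing ones being XOR and its negation.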
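In order to better inspect the behaviour of the ratio of the number of linearly 
implementable dichotomies to their total number when the number of patterns, $p$, is 
larger than the patterns' space dimension, $N$, it proves convenient to set $p = \alpha N$ and 
let $N$ become large. Then, the Gaussian approximation to the binomial probability 

$$ \binom{\alpha N}{k}\, x^k (1-x)^{\alpha N - k} \simeq \frac{1}{\sqrt{2\pi \alpha N x(1-x)}} \exp\Big( -\frac{(k - \alpha N x)^2}{2 \alpha N x(1-x)} \Big) , $$

which holds for $0 < x < 1$ and $\alpha N, k \gg 1$, can be used. Choosing $x = 1/2$, the 
cumulative probability can then be approximated by a Gaussian integral: 

$$ \frac{C(\alpha N, N)}{2^{\alpha N}} = \frac{1}{2^{\alpha N - 1}} \sum_{k=0}^{N-1} \binom{\alpha N - 1}{k} \simeq \sum_{k=k(N)}^{N-1} \sqrt{\frac{2}{\pi \alpha N}}\, \exp\Big( -\frac{2}{\alpha N} \Big( k - \frac{\alpha N}{2} \Big)^2 \Big) \tag{3.51} $$

$$ \simeq \sqrt{\frac{2}{\pi \alpha}} \int_{-\infty}^{\sqrt{N}(1 - \frac{\alpha}{2})} du\ e^{-2u^2/\alpha} \ \xrightarrow[N \to \infty]{}\ \begin{cases} 1 & \alpha < 2 \\ 1/2 & \alpha = 2 \\ 0 & \alpha > 2 \end{cases} , \tag{3.52} $$

where the lower value $k(N)$ in the summation is chosen to be an integer which 
diverges with $N$ slower than $\sqrt{N}$, the discarded finite sum becoming negligible 
when divided by $2^{\alpha N - 1}$. 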

The above considerations naturally lead one to introduce the notion of storage 
capacity, denoted by $\alpha_c$, as an answer to the question: which is the maximal number 
of patterns $p^*$ that can be stored reliably, given the number $N$ of their components? 
Based on Cover's counting argument, the situation can be summarized as follows: if the 
number of patterns is less than $N$, then with high probability they are in general 
position and all of them can be reliably classified, whereas, if there are $p = \alpha N > N$ 
patterns, the fraction of their classifications that can be reliably implemented shrinks to zero with increasing $N$ 
as soon as $\alpha > 2$. 
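The sharpening of the transition at $\alpha = 2$ can be observed already at moderate $N$ by evaluating the ratio $C(\alpha N, N)/2^{\alpha N}$ exactly. The sketch below is illustrative: the helper name and the chosen values of $N$ and $\alpha$ are assumptions, and exact rational arithmetic is used to avoid overflow.

```python
from fractions import Fraction
from math import comb

def separable_fraction(p, n):
    """Exact ratio C(p, N) / 2^p of linearly implementable dichotomies."""
    count = 2 * sum(comb(p - 1, k) for k in range(n))
    return Fraction(count, 2 ** p)

# Ratio at p = alpha * N for increasing N and three values of alpha.
for n in (10, 50, 200):
    row = [float(separable_fraction(int(alpha * n), n)) for alpha in (1.5, 2.0, 2.5)]
    print(n, row)
```

At $\alpha = 2$ the ratio equals $1/2$ for every $N$, while on either side it approaches 1 or 0 as $N$ grows, in agreement with the limit (3.52).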


3.3.3 Storage Capacity: Statistical Approach 


In the statistical approach to the storage capacity developed by Gardner [146-148], 
both input patterns and targets are considered as independent and identically 
distributed random variables. This opens the way to the use of powerful statistical 
techniques of spin-glass theory; indeed, the possibility to find a separating 
hyperplane for randomly labeled patterns belongs in fact to the class of random constraint 
satisfaction problems [142]. In this approach, the critical parameter $\alpha = p/N$ gives rise to a 
phase transition in the high-dimensional case, and the pattern capacity is determined 
by the critical value $\alpha_c$ separating the SAT-phase, for $\alpha < \alpha_c$, where it is possible 
to satisfy all the constraints, i.e. classify all the patterns, from the UNSAT-phase, 
$\alpha > \alpha_c$, where the minimum number of unsatisfied constraints is larger than zero. 

From such a statistical perspective, the optimal storage capacity of a perceptron is 
defined by estimating the volume of weights $w \in \mathbb{R}^N$ of definite length $\|w\|$ 
that correctly classify the input patterns. For stability reasons, it proves convenient 
to strengthen condition (3.44) by means of a threshold $\kappa \ge 0$, which amounts to 
requiring that the stabilities in (3.45) must satisfy 

$$ \Delta^\mu \ge \kappa , \qquad \mu = 1, \dots, p . \tag{3.53} $$


Remark 3.3.3 From a practical perspective, the non-vanishing lower bound to the 
stabilities yields a reduction of the probability of classification errors due to corrupted 
input patterns. Roughly speaking, the presence of noise can slightly alter a given 
input pattern $x^\mu$ so that the corresponding $\Delta^\mu$ may randomly change from positive 
to negative; then, $\kappa = 0$ would wrongly classify it, while $\kappa > 0$ would partially 
attenuate such a shortcoming by excluding those patterns $x^\mu$ such that $-\kappa \|w\| < 
w \cdot x^\mu < +\kappa \|w\|$. 


Due to the definition of the stabilities $\Delta^\mu$ in (3.45), the validity of the conditions 
(3.53) is independent of the norm of the weight vector $w$. Thus, one can suitably 
fix it and compute the volume occupied by weights that satisfy the conditions (3.53) 
relative to the volume occupied by weights of fixed norm. Choosing the latter to be 
such that $\|w\|^2 = N$ corresponds to considering $w$ whose components are $O(1)$. 
Then, the relative volume is concretely written as: 

$$ V^c_\kappa\big( \{x^\mu, \xi^\mu\}_{\mu=1}^{p} \big) = \frac{1}{Z_N} \int_{\mathbb{R}^N} dw\ \delta(\|w\|^2 - N) \prod_{\mu=1}^{p} \Theta(\Delta^\mu - \kappa) , \tag{3.54} $$

where the superscript $c$ stands for classical in order to distinguish it from an analogous 
quantum relative volume in weight-space that will be introduced in Sect. 7.8.3, while 
$\Theta$ denotes the Heaviside step function, $\Theta(x) = 1$ for $x \ge 0$ and zero otherwise, and 

$$ Z_N := \int_{\mathbb{R}^N} dw\ \delta(\|w\|^2 - N) = \frac{\pi^{N/2}\, N^{N/2 - 1}}{\Gamma(N/2)} \simeq \frac{(2\pi e)^{N/2}}{\sqrt{4\pi N}} \tag{3.55} $$

is the reference volume of weights contained in the hypersphere of radius $\sqrt{N}$; 
$\Gamma(z)$ is the Gamma function and the large $N$ behaviour follows from the Stirling 
approximation. 

The expression (3.54) resembles a micro-canonical partition function; in analogy 
with statistical mechanics, the relevant quantity is actually $\log V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p})$ 
rather than $V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p})$ itself, of which we will compute the average 
$\big\langle \log V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big\rangle_{x,\xi}$ with respect to the identically and independently distributed 
stochastic variables $\xi^\mu$, $x_j^\mu$, with $\mu = 1, 2, \dots, p$, and $j = 1, 2, \dots, N$; namely such 
that 

$$ \mathrm{Prob}(x_j^\mu) = \mathrm{Prob}(\xi^\mu) = \frac{1}{2} , \qquad \mathrm{Prob}(x_j^1, x_j^2, \dots, x_j^p) = \mathrm{Prob}(\xi^1, \xi^2, \dots, \xi^p) = \frac{1}{2^p} . \tag{3.56} $$


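The average can be computed by means of the so-called replica trick: roughly 
speaking, one uses 

$$ \big\langle \log V^c_\kappa \big\rangle_{x,\xi} = \lim_{n \to 0} \frac{\big\langle (V^c_\kappa)^n \big\rangle_{x,\xi} - 1}{n} = \lim_{n \to 0} \frac{1}{n} \log \big\langle (V^c_\kappa)^n \big\rangle_{x,\xi} , \tag{3.57} $$

evaluating $\big\langle \big( V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big)^n \big\rangle_{x,\xi}$ for $n \in \mathbb{N}$, which will then amount to the 
number of statistically independent and identically distributed replicas of the given 
statistical ensemble, and then setting $n = 0$ [245]. Using saddle-point methods, under 
the assumption of a replica symmetric scenario, Gardner [147] showed the existence 
of a critical value of $\alpha$, given by 

$$ \alpha_c(\kappa) = \Big[ \int_{-\kappa}^{\infty} \frac{dt}{\sqrt{2\pi}}\ e^{-t^2/2}\, (t + \kappa)^2 \Big]^{-1} , \tag{3.58} $$

such that for $\alpha < \alpha_c(\kappa)$ the following limit holds: 

$$ \lim_{\substack{N \to \infty \\ p/N = \alpha}} \frac{\big\langle \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big\rangle_{x,\xi}}{N} = F(\alpha, \kappa) := \operatorname*{extr}_{q} \Big\{ \alpha \int_{-\infty}^{+\infty} \frac{dt}{\sqrt{2\pi}}\ e^{-t^2/2}\, \ln\Big[ 1 - \Phi\Big( \frac{\kappa + t\sqrt{q}}{\sqrt{1-q}} \Big) \Big] + \frac{q}{2(1-q)} + \frac{1}{2} \ln(1-q) \Big\} , \tag{3.59} $$

where 

$$ \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt , \tag{3.60} $$

while for $\alpha > \alpha_c(\kappa)$: 

$$ \lim_{\substack{N \to \infty \\ p/N = \alpha}} \frac{\big\langle \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big\rangle_{x,\xi}}{N} = -\infty . \tag{3.61} $$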
ies a 
The replica method introduces several order parameters, the most important one 
being the average overlap of two randomly chosen weights $w^\gamma$ and $w^\delta$ in different 
replicas: 

$$ q_{\gamma\delta} = \frac{1}{N} \sum_{j=1}^{N} w_j^\gamma\, w_j^\delta . \tag{3.62} $$

In the replica symmetric ansatz it is assumed that for the solution of the saddle point 
equations the average overlap is the same for each pair of replicas, i.e. $q_{\gamma\delta} = q$ for 
all $\gamma \ne \delta$. Notice that by increasing the ratio $p/N$, the number of weights satisfying 
the storage condition (3.53) diminishes, hence their average overlap increases. The 
critical value $\alpha_c$ is then obtained in the limit of maximal overlap $q \to 1$. 


Remark 3.3.4 Gardner's approach can be rigorously formulated [318,349], by using 
that the random variable $N^{-1} \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p})$ is self-averaging; namely, 
deviations from the average $N^{-1} \big\langle \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big\rangle_{x,\xi}$ become vanishingly small in 
the limit $N \to \infty$. In other words, the average $N^{-1} \big\langle \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \big\rangle_{x,\xi}$ is a 
good representative of what happens for almost all realizations of the random 
variable $N^{-1} \ln V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p})$, each realization corresponding to a particular choice 
of the random patterns and classifications $\{x^\mu, \xi^\mu\}_{\mu=1}^{p}$. 


According to such an observation, equations (3.59) and (3.61) can be interpreted as 
follows: for $\alpha < \alpha_c(\kappa)$ the relative volume of weights that are able to correctly 
classify a random choice of the patterns shrinks as $V^c_\kappa(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}) \simeq \exp(F(\alpha, \kappa)\, N)$. 
Indeed, notice that, from its definition in (3.59), the exponent is negative. Instead, 
above the critical value, that is when $\alpha > \alpha_c(\kappa)$, the relative volume is more than 
exponentially vanishing with $N$, i.e. $V^c_\kappa = o(\exp(-cN))$. 

This transition between two so drastically different behaviours of the relative 
volume of weights at the critical value $\alpha_c(\kappa)$ is reminiscent of the transition of the fraction 
of correctly classifiable patterns from 1 to 0 at large $N$ when $\alpha = p/N = 2$ that 
occurs in Cover's counting result. Indeed, at $\kappa = 0$, (3.58) yields $\alpha_c(0) = 2$. It is thus 
meaningful to identify $\alpha_c(\kappa)$ as the critical storage capacity of a simple perceptron. 
A detailed derivation of (3.59) will be given in Sect. 7.8.3 for the critical storage 
capacity of a quantized version of the classical perceptron from which (3.59) emerges 
in the classical limit. 
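The critical value (3.58) is a one-dimensional Gaussian integral and can be evaluated numerically. The sketch below is only illustrative: the function name, the midpoint quadrature, the truncation of the integration range and the number of steps are all assumptions of this example.

```python
import math

def gardner_alpha_c(kappa, upper=12.0, steps=20_000):
    """Evaluate alpha_c(kappa) of (3.58) by a midpoint rule on [-kappa, upper].

    The Gaussian weight makes the contribution beyond t = upper negligible.
    """
    a = -kappa
    h = (upper - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += math.exp(-t * t / 2.0) * (t + kappa) ** 2
    integral = total * h / math.sqrt(2.0 * math.pi)
    return 1.0 / integral

for kappa in (0.0, 0.5, 1.0, 2.0):
    print(kappa, gardner_alpha_c(kappa))
```

At $\kappa = 0$ the integral equals $1/2$, which gives the value $\alpha_c(0) = 2$ of Cover's counting argument; increasing the threshold $\kappa$ lowers the capacity, as expected from the more restrictive condition (3.53).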


3.3.3.1 Bibliographical Notes 

Most of the results about the KS entropy have been drawn from [77]; other excellent 
books on the subject and its applications are [18,111,197]. In particular, the com- 
pleteness of the KS entropy for Bernoulli systems is discussed in [111]. The book 
by [239] is a reference for Pesin’s theory and the relations between the KS entropy 
and Lyapounov exponents (see also [128]). 
In [78] one finds an extensive review of the role of KS entropy and Lyapounov 
exponents as regards the issue of predictability in continuous and discrete dynam- 
ical systems. In [19], the notion is discussed in relation to the broader notion of 
complexity. 

The material on coding and compression has largely been drawn from 
[113,300,313]. 

Section 3.3 is mostly based on the book [169] which provides an exhaustive review 
of neural network techniques and of the statistical mechanics approach to them. A 
more focussed introduction to the statistical approach to neural networks can be 
found in [244]. 
Algorithmic Complexity 


One of the intuitive notions which is most elusive from a mathematical point of 
view is that of randomness. Consider a string $i^{(n)} \in \Omega_2^{(n)}$ emitted by a Bernoulli 
source with probabilities $p_{0,1}$; suppose that $n \gg 1$ and that the number of 0s, $n(0)$, 
is nearly half the number of 1s, $n(1) \simeq 2n(0)$. One expects that, generically, the 
relative frequencies $n(i)/n$ tend to the probabilities $p_i$ with increasing $n$; indeed, 
only special, that is intuitively non-random, strings should fail such a statistical test. 
Therefore, one would call $i^{(n)}$ random only if $p_0 = 1/3$ [358]. Of course, passing 
the frequency test is not enough; indeed, if $p_0 = 1/2$, both $i^{(n)}$ consisting of $n/2$ 
subsequent pairs 0, 1 and a string $j^{(n)}$ of 0s and 1s distributed without any evident 
pattern occur with probability $2^{-n}$. However, because of its regularity, $i^{(n)}$ would be 
called non-random and, vice versa, because of the absence of regular structures, $j^{(n)}$ 
would be called random [113,366]. 

Presence and absence of patterns seems to be a useful clue to defining which 
strings or sequences are random and which are not so; this property should somehow 
be related to the degree of compressibility so that one might wonder whether the 
entropy rate introduced in Chap. 3 could provide a natural measure of randomness. 
Also, by replacing the entropy rate with the dynamical KS entropy, one could define 
a classical dynamical system to be random or not on the basis of the compressibility 
of the best ones amongst its symbolic models. However, entropy rate and the KS 
entropy describe the average behavior of sources or of dynamical systems and say 
nothing about individual strings or individual trajectories. 

Various attempts have been undertaken to tackle the problem of formalizing the 
intuitive notion of randomness of individual sequences $i \in \Omega_2$. In [358], three 
relevant approaches are discussed: in the first one, randomness is identified with 
stochasticness, that is with the impossibility of devising a winning strategy when bets on the 
value of the next symbol $i_n$ of $i \in \Omega_2$ are based on the knowledge of $i_1 i_2 \cdots i_{n-1}$. 
In the second approach, randomness is identified with chaoticness, that is with the 
absence of regular patterns in $i \in \Omega_2$. In the third approach, randomness in a sequence 
$i \in \Omega_2$ is identified with its typicalness, that is with the fact that it does not belong 
to any effectively null subset of $\Omega_2$.¹ 

In the following we shall focus on the second approach, which is also known 
as algorithmic complexity theory and was developed independently and almost at 
the same time by Kolmogorov [207,208], Chaitin [96] and Solomonoff [334,335] 
in the early sixties. Algorithmic complexity theory involves subjects as diverse as 
mathematics, logic, computer science and physics [92,300,366]: we shall give a 
short overview of some of its aspects that are relevant for an extension of this notion 
to quantum dynamical systems. 


4.1 Effective Descriptions 


The main step towards a theory of randomness of individual strings was the observa- 
tion that regular strings admit short effective descriptions, whereas irregular strings 
do not. By effective description of a (binary) target string it is meant any algorithm 
(binary program) that is computed by a suitable computer and makes it halt with the 
target string as output. 


Example 4.1.1 Any string $i^{(n)} = i_1 i_2 \cdots i_n$ consisting of $n$ bits can always be 
reproduced by processing the program 

PRINT $i_1 i_2 \cdots i_n$, 

which specifies the bits to print, one after the other. 

This program amounts to the literal transcription of the target string. Clearly, one 
has to seek more clever ways to describe $i^{(n)}$, that is shorter programs. In doing 
so, one is much helped by the presence of patterns; if $i_j = 0$ for all $1 \le j \le n$, the 
following simple program could be used: 

PRINT 0 n TIMES. 

For large $n$, the length of such a program goes as $\log_2 n$, that is as the number of bits 
necessary to specify the length of the string $\ell(i^{(n)}) = n$. This is also the case if, less 
trivially, the string $i^{(n)}$ consists of a same pattern, $i^{(q)}$, that repeats itself $\simeq n/q$ times. 
Indeed, what is to be specified is the length of the pattern at the cost of a fixed number, 
$\log_2 q$, of bits and the number of repetitions at the cost of $\simeq \log_2 n/q \simeq \log_2 n$ bits 
for $n \gg q$. 

On the other hand, if $i^{(n)}$ shows no pattern, there is no shorter effective description 
than literal transcription. In this case, the length of the effective description grows 
as $n$ and not as $\log_2 n$. 


¹ Let $\Omega_2$, the set of all binary sequences, be equipped with the $\sigma$-algebra generated by cylinder sets 
and with the uniform product probability distribution $\mu$ so that any cylinder $C_i$ indexed by a string 
$i \in \Omega_2^*$ of length $\ell(i)$ has probability $\mu(C_i) = 2^{-\ell(i)}$. A subset $A \subset \Omega_2$ is a null subset if for any 
$\epsilon > 0$ there are cylinders $C_{i_j}$, $i_j \in \Omega_2^*$, such that $A \subseteq \bigcup_j C_{i_j}$ and $\sum_j 2^{-\ell(i_j)} \le \epsilon$. A subset $A \subset \Omega_2$ 
is an effectively null subset if the previous inequality is satisfied with the strings $i_j$ that index the 
cylinders and $\epsilon > 0$ (any rational number) both effectively computable by a suitable algorithm 
(for instance by a program processed by a computer) [358]. Intuitively, random sequences cannot 
be effectively reproducible and thus cannot belong to effectively null sets. Concretely, these latter 
sets consist of non-typical strings and correspond to effective statistical tests or Martin-Löf tests 
that, when failed, identify these non-random strings (an example is the frequency test mentioned in 
the discussion prior to this remark) [366]. In other words, a sequence is random according to the 
typicalness criterion if it passes all Martin-Löf tests. On the other hand, if typicalness were defined 
with reference to all possible null subsets, then there would be no typical sequences; indeed, any 
$i \in \Omega_2$ belongs to the null subset of $\Omega_2$ consisting of the sequence itself. 


In the previous example, it is clear that one is interested in the shortest possible
effective descriptions $s(i^{(n)})$ of a given string $i^{(n)}$: let $C(i^{(n)})$ denote the length of
any of these shortest descriptions, that is $\ell(s(i^{(n)})) = C(i^{(n)})$.

The map $i^{(n)} \mapsto s(i^{(n)})$ is a code for the ensemble of strings of length $n$. In Sect. 4.3,
it will be shown that, by processing the effective descriptions by means of particular
computing devices called prefix machines (in which case $C(i^{(n)})$ is denoted by
$K(i^{(n)})$), the code becomes a prefix code (see Definition 3.2.1), so that the extended
Kraft inequality (see Example 3.2.2) applies

$$\sum_{i^{(n)} \in \Omega_2^{(n)}} 2^{-K(i^{(n)})} \le 1 . \tag{4.1}$$


Example 4.1.2 (Payoff Functions [144,366]) Suppose the government of a country
claims that in the $j$-th one of $n$ successive elections it won with $0.99i_j$ percent
of the votes, $i_j$ being any decimal digit for $j$ odd and the $j/2$-th digit in the decimal
expansion of $\pi$ for $j$ even. To defend itself from the accusation of fabricating the electoral
results, the government replies that the probability $Q(i^{(n)}) = 10^{-n}$ of such a string of
decimal digits $i^{(n)} = i_1 i_2 \cdots i_n$ is equal to that of any other string randomly obtained
according to the uniform probability distribution over 10 symbols. This defense can
be defeated by using the regularity of $i^{(n)}$ to construct a suitable payoff function
$t(i^{(n)}|Q) \ge 0$, namely a non-negative function whose mean value is such that

$$\sum_{i^{(n)} \in \Omega_{10}^{(n)}} 10^{-n}\, t(i^{(n)}|Q) \le 1 .$$

Its meaning is as follows: the accuser proposes to the government to be paid $t(i^{(n)}|Q)$
upon betting 1 on the outcome $i^{(n)}$. This is a fair proposal for, if the outcomes $i^{(n)}$ are
distributed according to the uniform probability $Q$, the accuser's average gain cannot
be higher than 1.
However, if there is a pattern in $i^{(n)}$, the accuser can construct a payoff function
$t(i^{(n)}|Q)$ that assumes high values on the strings with such a pattern. Concretely, for
the half of the decimal digits of $i^{(n)}$ that are randomly distributed according to $Q$, one
needs $(n/2)\log_2 10$ bits for its description; instead, for the remaining half that comes
from an algorithm that computes successive approximations to $\pi$, a finite number,
$C$, of bits² suffices. Then, one gets the following upper bound to the length of the
shortest effective description computed by a prefix machine (see previous remark),

$$K(i^{(n)}) \le \frac{n}{2} \log_2 10 + C .$$

Setting $t(i^{(n)}|Q) := 2^{-\log_2 Q(i^{(n)}) - K(i^{(n)})} = 10^{n}\, 2^{-K(i^{(n)})}$ one defines a payoff
function; indeed, because of (4.1),

$$\sum_{i^{(n)} \in \Omega_{10}^{(n)}} Q(i^{(n)})\, 2^{-\log_2 Q(i^{(n)}) - K(i^{(n)})} = \sum_{i^{(n)} \in \Omega_{10}^{(n)}} 2^{-K(i^{(n)})} \le 1 .$$

While any fair Casino's owner should accept bets based on such a payoff function,
the government cannot; indeed, by betting 1 on the digit of each one of $n$ successive
elections, the accuser will pay $n$ to the government but receive at least $10^{n/2}\, 2^{-C}$ from it,
quite an amount of money for large $n$. As the payoff function depends only on
the presence of a pattern, but not on its particular form, the accuser's strategy does not
require any a priori knowledge.
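To get a feeling for the orders of magnitude involved, the following Python sketch evaluates the logarithm of the lower bound $10^{n/2}\,2^{-C}$ on the accuser's payoff. The value chosen for the constant $C$ in the usage note is purely hypothetical; the text only states that $C$ is finite and independent of $n$.

```python
import math


def log10_payoff_lower_bound(n: int, c_bits: float) -> float:
    """log10 of 10**(n/2) * 2**(-c_bits), the lower bound on the payoff t."""
    return n / 2 - c_bits * math.log10(2)
```

With $n = 100$ elections and a hypothetical $C = 20$ bits, the bound is about $10^{44}$, to be compared with the stake $n = 100$.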


The aim of algorithmic complexity theory is an objective characterization of the
randomness of individual strings in terms of the lengths of their shortest effective
descriptions. It is thus necessary to eliminate the dependence of such lengths on
the computers that process the corresponding programs. Indeed, given a same target
string $i^{(n)}$, two different computers $\mathfrak{U}_{1,2}$ will in general provide shortest
descriptions $s_{1,2}(i^{(n)})$ with different lengths $C_{1,2}(i^{(n)})$. As explained in Proposition 4.1.1,
this problem is overcome by resorting to effective descriptions processed by universal
computers, namely by computers that are able to simulate the action of any
other computing machine. The universal computers on which classical algorithmic
complexity theory is based are the so-called Universal Turing Machines (UTMs).


4.1.1 Classical Turing Machines 


A Turing Machine (TM) is a very basic (and abstract) model of computing device
(see [366]) consisting of

1. a bi-infinite tape $T$ subdivided into cells labeled by integers $i \in \mathbb{Z}$, each cell
containing either a blank symbol $\#$ or a symbol $\sigma$ from a given alphabet $\Sigma$. We
shall set $\tilde{\Sigma} = \Sigma \cup \{\#\}$;

2. a read/write head $H$ moving along the tape that, when positioned on the $i$-th
cell, reads the symbol $\sigma_i \in \tilde{\Sigma}$, leaves it unchanged or changes it into $\sigma'_i \in \tilde{\Sigma}$ and
then proceeds to either the cell $i + 1$ to the right (R) or to the cell $i - 1$ to the left
(L);

3. a central processing unit $C$ (CPU) capable of a finite number of control states
$q_i \in Q := \{q_0, q_1, \ldots, q_{|Q|-1}\}$: at each computational step, the CPU state $q \in Q$
may remain the same or change into $q' \in Q$.

² This number becomes negligible when $n$ increases.


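The list of possible moves defines a program for the TM; formally, it amounts to
a transition function

$$\delta : Q \times \tilde{\Sigma} \to Q \times \tilde{\Sigma} \times \{L, R\}, \qquad \delta(q, \sigma) = (q', \sigma', d), \quad d \in \{L, R\}. \tag{4.2}$$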


As a consequence, any TM can be identified by the set of rules defining $\delta$. Each
set of rules, that is any TM, corresponds to a certain task, a computation, to be
performed on an input data string. Any computation can be assumed to start with the
CPU control state in a chosen ready state $q_r$, the head positioned on a chosen 0-th
cell and the input written on a finite number of cells extending from the 0-th one to
its left, while all other cells to the left and to the right contain blank symbols. The
computation then proceeds through a sequence of steps dictated by the transition
function $\delta$, each one of them corresponding to a certain configuration of the TM that
performs it.


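Definition 4.1.1 (TM configurations) At each step of a computation a classical
configuration $c$ of a TM $\mathfrak{U}$ is a triplet

$$\mathcal{C} \ni c := \left(q, \{\sigma_i\}_{i \in \mathbb{Z}}, k\right) \in Q \times \tilde{\Sigma}^{\mathbb{Z}} \times \mathbb{Z},$$

where in the infinite sequence $\{\sigma_i\}_{i \in \mathbb{Z}}$ of cell symbols only finitely many of them
are such that $\sigma_i \neq \#$, while $q$ and $k$ denote the state of the control unit and the head
position, respectively, and $\mathcal{C}$ is the set of all configurations.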


In order to determine when a computation terminates, we assume that among the
control states there is a special state, $q_f$, such that when the control unit is in the
state $q_f$, the computation halts and the output is read off from the position of the
head to its right until the last $\sigma_i \neq \#$.

Because they consist of a finite set of rules involving finite sets of symbols,
transition functions (and thus TMs) can be encoded and numbered. Given a program
$p$ (or the TM which computes it), its number $\gamma(p)$ in the enumeration of all programs
(or TMs) is known as the Gödel number of $p$ [114]. A universal Turing machine is any
TM $\mathfrak{U}$ which, upon receiving the code of a TM $\mathfrak{V}$, is able to simulate $\mathfrak{V}$ on any input
string.
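A minimal Python sketch of such a device may help fix ideas. It stores the transition function (4.2) as a dictionary and reads the output to the right of the head once the final state is reached. The state names, the `max_steps` safeguard and the choice of writing the input on cells 0, 1, 2, ... (rather than to the left of the 0-th cell, as in the text) are assumptions of this illustration.

```python
BLANK = "#"


def run_tm(delta, tape_input, ready="r", final="f", max_steps=10_000):
    """Run a deterministic TM given as a dict (q, s) -> (q', s', d)."""
    tape = {k: s for k, s in enumerate(tape_input)}  # all other cells are blank
    state, head, steps = ready, 0, 0
    while state != final:
        if steps == max_steps:
            raise RuntimeError("no halt within max_steps")
        symbol = tape.get(head, BLANK)
        state, tape[head], move = delta[(state, symbol)]
        head += 1 if move == "R" else -1
        steps += 1
    # Output: from the head position rightwards up to the last non-blank cell.
    last = max((k for k, s in tape.items() if s != BLANK), default=head - 1)
    return "".join(tape.get(k, BLANK) for k in range(head, last + 1))


# Example machine: append one more 1 to a unary string, then return to cell 0.
INCREMENT = {
    ("r", "1"): ("r", "1", "R"),
    ("r", "#"): ("b", "1", "L"),
    ("b", "1"): ("b", "1", "L"),
    ("b", "#"): ("f", "#", "R"),
}
```

Calling `run_tm(INCREMENT, "111")` walks right over the input, writes a further `1`, walks back and halts on cell 0 with the tape content `1111` to the right of the head.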
Example 4.1.3 There are many possible ways to encode a transition function $\delta$; a
simple one is as follows [156]: the control states $q_i$ and the symbols $\sigma_j$ are identified
by giving their positions $i$, $j$ in the respective lists $Q$ and $\tilde{\Sigma}$. These are then encoded
as strings of as many 0's:

$$q_i \mapsto 0^i := \underbrace{00\cdots0}_{i \text{ times}}, \qquad \sigma_j \mapsto 0^j := \underbrace{00\cdots0}_{j \text{ times}} .$$

Thus, the rule $\delta(q_i, \sigma_j) = (q_k, \sigma_\ell, d)$ can be encoded as $0^i\, 1\, 0^j\, 1\, 0^k\, 1\, 0^\ell\, 1\, 0^{n(d)}$,
where the 1s are used to separate the various entries (only sequences of 0s are entries
corresponding to labels). These appear one after the other as they do in the given
rule, while $n(d) = 1$ if $d = L$, $n(d) = 2$ if $d = R$. Then, the transition function (or,
equivalently, the TM $\mathfrak{V}$ that performs the task specified by it) can be encoded as

$$1\, 0^{|Q|}\, 11\, 0^{|\tilde{\Sigma}|}\, 11\, \underbrace{0^{i_1} 1\, 0^{j_1} 1\, 0^{k_1} 1\, 0^{\ell_1} 1\, 0^{n(d_1)}}_{\text{1st rule}}\, 11\, \underbrace{0^{i_2} 1\, 0^{j_2} 1\, 0^{k_2} 1\, 0^{\ell_2} 1\, 0^{n(d_2)}}_{\text{2nd rule}}\, 11 \cdots 11\, \underbrace{0^{i_m} 1\, 0^{j_m} 1\, 0^{k_m} 1\, 0^{\ell_m} 1\, 0^{n(d_m)}}_{\text{last rule}}\, 111 ,$$

where the first two strings of 0s encode the total number of control states and of
symbols, the pairs of 1s separate the rules, while the first and last 1 mark the beginning
and the end of the list.
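The encoding of a single rule can be sketched in Python as follows. The function names are hypothetical, and the labels are assumed to be counted from 1, so that no entry is an empty string of 0s; the precise layout of the header and of the end marker in `encode_tm` is likewise an assumption of this sketch rather than a convention taken from [156].

```python
MOVE_CODE = {"L": 1, "R": 2}  # n(d) as in the text


def encode_rule(i, j, k, l, d):
    """Encode delta(q_i, s_j) = (q_k, s_l, d) as 0^i 1 0^j 1 0^k 1 0^l 1 0^n(d)."""
    return "1".join("0" * m for m in (i, j, k, l, MOVE_CODE[d]))


def decode_rule(code):
    """Invert encode_rule by measuring the runs of 0s between the separators."""
    i, j, k, l, n_d = (len(run) for run in code.split("1"))
    return i, j, k, l, {1: "L", 2: "R"}[n_d]


def encode_tm(n_states, n_symbols, rules):
    """Header with |Q| and the alphabet size, rules separated by pairs of 1s."""
    header = "1" + "0" * n_states + "11" + "0" * n_symbols + "11"
    return header + "11".join(encode_rule(*r) for r in rules) + "111"
```

Since single 1s only occur inside a rule and pairs of 1s only between rules, the code can be parsed back unambiguously, which is what makes the enumeration of TMs possible.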


Suppose $f : \mathbb{N} \to \mathbb{N}$ is a function from the integers to the integers; by passing
to the binary representation of $n \in \mathbb{N}$, $f$ becomes a function from $\Omega_2^*$ to $\Omega_2^*$. It is
called total if its domain of definition is the whole of $\Omega_2^*$ (symbolically, $f(i^{(n)})\downarrow$
for all $i^{(n)} \in \Omega_2^*$), partial otherwise, namely if there exist strings $i^{(n)}$ on which $f$ is
not defined (symbolically, $f(i^{(n)})\uparrow$ on these strings). The existence of an algorithm
or an effective procedure which allows one to compute $f$ provides an intuitive and
informal definition of computable functions; among others, a possible formalization
of computability is as follows [114].


Definition 4.1.2 A partial function $f : \Omega_2^* \to \Omega_2^*$ is said to be computable if there
is a Turing machine that on input $i^{(n)} \in \Omega_2^*$ outputs $f(i^{(n)})$.


The so-called Church-Turing thesis asserts that the intuitively and informally
defined set of computable functions coincides with those that are computable according
to the previous definition [114,156]. It is not a theorem but a conjecture that has so far
resisted all attempts to disprove it; therefore, it is commonly accepted that TMs provide a
computational model which computes everything that can be thought of as intuitively
computable.
Remark 4.1.1 Given a computable partial function $f$, if $p_f$ is one of the (infinitely
many) programs which compute it, one can assign $f$ the Gödel number $\gamma(p_f)$ of $p_f$
which is one of the (infinitely) many Gödel numbers of $f$ [114]. It follows that the
computable functions form a countable set; this fact allows the use of Cantor's diagonal
argument to construct a total function $f : \Omega_2^* \to \Omega_2^*$ which is not computable.
In order to show this, consider the enumeration as $\phi_j : \mathbb{N} \to \mathbb{N}$ of all computable
partial functions $f : \mathbb{N} \to \mathbb{N}$ that can be constructed by choosing a definite Gödel
number for each one of them. Then, the function defined by

$$\phi(n) = \begin{cases} \phi_n(n) + 1 & \text{if } \phi_n(n)\downarrow \\ 0 & \text{if } \phi_n(n)\uparrow \end{cases}$$

is total as $\phi\downarrow$ on all inputs. Furthermore, it cannot coincide with any $\phi_j$ for, if $\phi_j$ is
defined on $j$, then $\phi(j) = \phi_j(j) + 1 \neq \phi_j(j)$, while, if $\phi_j$ is not defined on $j$, $\phi(j) = 0$ is.
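The diagonal construction can be mimicked in Python on a finite list of functions. In this sketch the value `None` stands for "undefined"; this is an assumption made only to keep the example runnable, since for genuine partial computable functions undefinedness means non-halting and cannot be detected by a program, which is precisely why $\phi$ fails to be computable.

```python
def diagonal(phis):
    """Return phi with phi(n) = phis[n](n) + 1 if defined, and 0 otherwise."""
    def phi(n):
        value = phis[n](n)  # None models an undefined value
        return 0 if value is None else value + 1
    return phi


# A toy "enumeration": a total function, a nowhere defined one, a constant.
PHIS = [lambda n: n * n, lambda n: None, lambda n: 7]
```

By construction `diagonal(PHIS)` differs from the $n$-th function of the list at the argument $n$, so it cannot appear anywhere in the list.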


Example 4.1.4 An important class of TMs are the Probabilistic TMs (PTMs) which
provide a more powerful classical model of computation than TMs [156]. They are
defined by transition functions of the form

$$\delta : Q \times \tilde{\Sigma} \times Q \times \tilde{\Sigma} \times \{L, R\} \to [0, 1] \tag{4.3}$$
$$(q, \sigma; q', \sigma', d) \mapsto \delta(q, \sigma; q', \sigma', d) \in [0, 1], \qquad \sum_{q', \sigma', d} \delta(q, \sigma; q', \sigma', d) = 1 . \tag{4.4}$$

Namely, PTMs are defined by assigning the probabilities $\delta(q, \sigma; q', \sigma', d)$ with
which the machine goes from a CPU control state $q \in Q$ and symbol read $\sigma \in \tilde{\Sigma}$
to a new control state $q'$, new symbol $\sigma'$ together with a subsequent head move
$d \in \{L, R\}$. Therefore, given a starting configuration $c_i \in \mathcal{C}$ the machine will move to
a new configuration $c_j \in \mathcal{C}$ with a certain transition probability $p_{ij} := p(c_i \to c_j)$,
the successors of $c_i$ being all those $c_j$ with $p_{ij} \neq 0$. The transition probabilities satisfy
$\sum_j p_{ij} = 1$; indeed, given a starting configuration $c_i$, the PTM will surely move
to a subsequent one among those available to it. Each step performed by a PTM will
then be described by a transition matrix $t = [p_{ij}]$.

Any computation performed by a PTM on an initial configuration $c_0$ can be seen
as a tree whose nodes are the successor configurations and whose branches carry
the relative non-zero transition probabilities. Each run of the machine
defines a tree-level with its corresponding nodes; if a successor configuration at level
$j$ appears more than once then the probability of its occurrence at that level is the sum
of the probabilities leading to it through the various branches. As a simple instance
of such a mechanism [156], consider an initial configuration $c_0$ branching into two
different configurations $c_{11}$ and $c_{12}$ at level 1 with probabilities $p_{01} := p(c_0 \to c_{11})$
and $p_{02} := p(c_0 \to c_{12})$: $p_{01} + p_{02} = 1$. During the second step of the computation,
the two configurations at level 1 branch into two configurations each: $c_{11}$ into $c_{21}$
and $c_{22}$ with probabilities $p_{11} := p(c_{11} \to c_{21})$, respectively $p_{12} := p(c_{11} \to c_{22})$,
such that $p_{11} + p_{12} = 1$, while $c_{12}$ branches into $c_{23}$ and $c_{24}$ with probabilities
$p_{23} := p(c_{12} \to c_{23})$, respectively $p_{24} := p(c_{12} \to c_{24})$, such that $p_{23} + p_{24} = 1$
(see Fig. 4.1). Thus the probabilities of the four configurations are

$$p(c_{21}) = p_{01}\, p_{11}, \quad p(c_{22}) = p_{01}\, p_{12}, \quad p(c_{23}) = p_{02}\, p_{23}, \quad p(c_{24}) = p_{02}\, p_{24} .$$

If $c_{22} = c_{23} = c^*$ then the probability of $c^*$ is $p(c^*) = p(c_{22}) + p(c_{23})$.

Fig. 4.1 Probabilistic Turing machines: level tree (root $c_0$; level 2 nodes $c_{21}$, $c_{22}$, $c_{23}$, $c_{24}$)
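The level-by-level bookkeeping of the tree can be sketched in Python. The data structure (a dictionary mapping each configuration to its successors with their probabilities), the function name and the numerical values below are assumptions of this illustration; exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction as F


def next_level(level, step):
    """level: config -> probability; step: config -> {successor: probability}."""
    out = defaultdict(F)
    for config, prob in level.items():
        for succ, p_step in step[config].items():
            out[succ] += prob * p_step  # sum over branches reaching the same node
    return dict(out)


# Two-level tree of the text, with c22 = c23 identified as "c*".
STEP = {
    "c0": {"c11": F(1, 3), "c12": F(2, 3)},
    "c11": {"c21": F(1, 2), "c*": F(1, 2)},
    "c12": {"c*": F(1, 4), "c24": F(3, 4)},
}
```

Applying `next_level` twice to `{"c0": 1}` gives the level-2 distribution, in which the probability of `c*` is the sum of the contributions of the two branches leading to it.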


Remark 4.1.2 Within the class of PTMs, TMs are deterministic in the sense that the
corresponding probabilities $\delta(q, \sigma; q', \sigma', d)$ equal 1 when the couples $(q, \sigma)$ and
triplets $(q', \sigma', d)$ are connected by the rules (4.2), otherwise $\delta(q, \sigma; q', \sigma', d) = 0$.
The computations performed by TMs correspond to deterministic classical processes,
while those of PTMs correspond to stochastic classical processes (compare the ballistic
and Brownian computers discussed in [67]); in other words, it is the laws of
classical physics upon which the models of computations embodied by TMs and
PTMs are based.

PTMs are important from the point of view of the so-called computational
complexity³ [156,198]. All computational tasks need a certain amount of time to be
performed and use a certain amount of memory (space); roughly speaking, computational
complexity theory estimates how the amount of time and/or space required to
perform a computation involving $n$ bits scales with $n$: if the time required to process
$n$ bits goes as $n^\alpha$, $\alpha > 0$, one says that the computation has polynomial computational
complexity, otherwise superpolynomial or exponential. When a computer $\mathfrak{U}$
simulates another computer $\mathfrak{V}$ that performs a certain task, there is an unavoidable
overhead in space/time resources due to the simulation. The latter is then called efficient
if the overhead scales polynomially with respect to the space/time resources
used by $\mathfrak{V}$. The Classical Strong Church-Turing Thesis [198] states that:

Any realistic computational model can be efficiently simulated by a PTM.

Namely, any computational model which is consistent with the laws of classical
physics and which accounts for all necessary computational resources⁴ only requires
a polynomial space/time overhead to be simulated by a PTM. Like the Church-Turing
thesis (see Remark 4.1.1), the strong Church-Turing thesis has also survived all
attempts to disprove it; however, as observed by Feynman [139], this paradigm does
not seem to be extendible to computational models based on quantum mechanics, for
then classical physics appears unable to simulate their performances as efficiently.

³ To be distinguished from the descriptional complexity.
⁴ The adjective realistic refers to the fact that the time and space resources effectively needed should
be explicitly declared [198].


4.1.2 Kolmogorov Complexity 


In the following we shall restrict to the effective description of binary strings; using
the notation of the previous section, we shall therefore consider TMs with the alphabet
$\tilde{\Sigma} = \{0, 1\} \cup \{\#\}$. Further, $\ell(p)$ will denote the length, that is the number of bits, of a
program $p$ written as a binary string and $\mathfrak{U}(p)$ the result of $p$ being processed by a
TM $\mathfrak{U}$.


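Definition 4.1.3 (Kolmogorov Complexity) The Kolmogorov complexity [113,366]
or plain algorithmic complexity of $i^{(n)} \in \Omega_2^{(n)}$ is the length of the shortest binary
program $p$ such that $\mathfrak{U}(p) = i^{(n)}$:

$$C_{\mathfrak{U}}(i^{(n)}) = \min\left\{ \ell(p) \,:\, \mathfrak{U}(p) = i^{(n)} \right\} .$$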


Plain algorithmic complexity is thus seemingly related to the most efficient way
individual strings can be compressed; indeed, by the previous definition, no effective
description of a given string $i^{(n)}$ can be shorter than programs with length equal to
its algorithmic complexity $C(i^{(n)})$.
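The definition can be tried out on a deliberately tiny machine by brute-force search over all programs in order of increasing length. The machine below is an assumption of this sketch and is not universal: a program `0s` prints `s` literally, while a program `1bm` prints the bit `b` repeated as many times as the binary number `m`. On a genuine UTM such a search is not available, since one cannot decide which programs halt.

```python
from itertools import product


def prints(p: str, x: str) -> bool:
    """True if the toy machine (not a UTM) outputs x when run on program p."""
    if p[:1] == "0":                       # literal transcription
        return p[1:] == x
    if p[:1] == "1" and len(p) >= 3:       # bit p[1] repeated int(p[2:], 2) times
        return int(p[2:], 2) == len(x) and x == p[1] * len(x)
    return False


def toy_complexity(x: str) -> int:
    """Length of the shortest toy program printing x."""
    for length in range(len(x) + 2):
        for bits in product("01", repeat=length):
            if prints("".join(bits), x):
                return length
    raise AssertionError("unreachable: the program '0' + x always prints x")
```

For this machine a run of eight 0s has complexity 6 (the program `101000`), while the patternless string `0110` can only be transcribed literally at the cost of one extra bit.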


Remark 4.1.3 Unlike computational complexity (see Remark 4.1.2), algorithmic 
complexity is not concerned with the space/time resources needed to process certain 
programs, but only with their lengths, without restrictions on time and memory. 
From the algorithmic point of view, only random strings are interesting, while those 
with simple effective descriptions are somewhat dull, despite the large amount of 
resources that may be needed to compute them. Indeed, there might be short effective 
descriptions that require a very long time to yield their targets, as for instance the 
DNA-encoding of human beings [366]. The attempts to fill this gap by considering
algorithmic and computational complexity together have led to the notion of logical
depth [366].


⁵ We shall conform to the notation of [366] which uses the letter $C$ for the Kolmogorov complexity
and $K$ for the prefix complexity (see Sect. 4.3).


Proposition 4.1.1 The following properties hold:

1. The plain algorithmic complexities of a same string $i^{(n)}$ with respect to two
different UTMs $\mathfrak{U}_{1,2}$ differ by a constant which does not depend on the string, but
only on the UTMs.

2. The plain algorithmic complexity is upper bounded as follows

$$C_{\mathfrak{U}}(i^{(n)}) \le A + \ell(i^{(n)}) = A + n , \tag{4.5}$$

where $A$ is a constant which does not depend on $i^{(n)}$.

3. The number of strings $i^{(n)} \in \Omega_2^{(n)}$ with plain algorithmic complexity strictly
smaller than $c > 0$⁶ is bounded by

$$\#\left\{ i^{(n)} \in \Omega_2^{(n)} \,:\, C_{\mathfrak{U}}(i^{(n)}) < c \right\} \le 2^c - 1 . \tag{4.6}$$


Proof The proof of the first statement follows from the fact that $\mathfrak{U}_1$ can simulate $\mathfrak{U}_2$
and vice versa, for both are assumed to be universal. Given $i^{(n)}$, let $p_1^*$ be such that
$C_{\mathfrak{U}_1}(i^{(n)}) = \ell(p_1^*)$ and let $P_{12}$ be the program, of length $\ell(P_{12}) = L_{12}$, which allows
$\mathfrak{U}_2$ to simulate $\mathfrak{U}_1$. In order to make $\mathfrak{U}_2$ simulate $\mathfrak{U}_1$ on the input $p_1^*$, the programs $P_{12}$
and $p_1^*$ must be put together in such a way that $\mathfrak{U}_2$ knows when the simulation instructions
end and the string to be processed starts. This is achieved by concatenating $P_{12}$ and
$p_1^*$ as $q = \beta(P_{12})\, p_1^*$, where

$$i^{(n)} = i_1 i_2 \cdots i_n \mapsto \beta(i^{(n)}) = i_1 i_1 i_2 i_2 \cdots i_n i_n 01$$

is the encoding of a string obtained by repeating each of its bits twice and marking
the end with the pair of different bits 01: for this encoding one needs $\ell(\beta(P_{12})) =
2(L_{12} + 1)$ bits. In this way $\mathfrak{U}_2$ will first read $\beta(P_{12})$, being thus able to simulate
$\mathfrak{U}_1$ on the subsequent portion $p_1^*$ of the program $q$. Therefore, from the definition of
plain complexity, it follows that

$$C_{\mathfrak{U}_2}(i^{(n)}) \le \ell(q) = \ell(p_1^*) + 2(L_{12} + 1) = C_{\mathfrak{U}_1}(i^{(n)}) + A_{12} , \qquad A_{12} := 2(L_{12} + 1) .$$

Reversing the roles of $\mathfrak{U}_{1,2}$ one gets $C_{\mathfrak{U}_1}(i^{(n)}) \le C_{\mathfrak{U}_2}(i^{(n)}) + A_{21}$; thus,
$\left| C_{\mathfrak{U}_1}(i^{(n)}) - C_{\mathfrak{U}_2}(i^{(n)}) \right| \le A$, where $A$ is a suitable constant which does not depend
on the input $i^{(n)}$.

The upper bound (4.5) follows as in Example 4.1.1, from the effective description
which tells $\mathfrak{U}$ to print the bits of $i^{(n)}$ one after the other.


⁶ If $c$ is not an integer, $c$ is to be understood as $\lfloor c \rfloor$, the largest integer not larger than $c$:
$\lfloor c \rfloor \le c < \lfloor c \rfloor + 1$.
The bound (4.6) follows because the number of binary programs with
length smaller than $c$ equals the number of binary strings with $\lfloor c \rfloor - 1$ digits at the
most and each program outputs at most one string, whence

$$\#\{ p \,:\, \ell(p) < c \} = \sum_{j=0}^{\lfloor c \rfloor - 1} 2^j = 2^{\lfloor c \rfloor} - 1 \le 2^c - 1 .$$
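The self-delimiting encoding used in the proof (each bit doubled, end marked by `01`) is easy to try out. In the following Python sketch the symbol `beta` for the encoding and the helper name `split_program` are choices of this illustration.

```python
def beta(s: str) -> str:
    """Repeat each bit twice and mark the end with the pair of different bits 01."""
    return "".join(b + b for b in s) + "01"


def split_program(q: str):
    """Split q = beta(P) + rest into (P, rest)."""
    k = 0
    while k + 1 < len(q) and q[k] == q[k + 1]:  # doubled bits belong to beta(P)
        k += 2
    if q[k:k + 2] != "01":
        raise ValueError("missing end marker 01")
    return q[:k:2], q[k + 2:]
```

The cost of the encoding is $2(\ell(P) + 1)$ bits, and the machine reading `q` from left to right recognizes where the simulation instructions end without any further information.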


Example 4.1.5 In order to improve the loose upper bound (4.5), given $i^{(n)} \in \Omega_2^{(n)}$,
let $k$ be the number of 1s among its bits; there are $\binom{n}{k}$ strings in $\Omega_2^{(n)}$ sharing this
feature. They can be listed and each of them identified by its number $N_k(i^{(n)})$ in the
list; notice that no more than $\lceil \log_2 \binom{n}{k} \rceil$ bits are required to specify $N_k(i^{(n)})$. One can
thus construct an effective description of $i^{(n)}$ by specifying $k$ and $N_k(i^{(n)})$ in such
a way that the UTM is able to detach the specification of $k$, $p_k$, from that of
$N_k(i^{(n)})$. For this, one may do as in the proof of Proposition 4.1.1, by encoding $p_k$
as $\beta(p_k)$, the binary string obtained from $p_k$ by repeating each of its bits twice and
marking the end by 01. Since $\ell(\beta(p_k)) \le 2(\log_2 k + 1)$, from Definition 4.1.3 it
follows that

$$C(i^{(n)}) \le \log_2 \binom{n}{k} + 2(\log_2 k + 1) .$$

The following upper bound holds [113],

$$\binom{n}{k} \le 2^{\,n H_2(k/n)} ,$$

where $H_2\!\left(\frac{k}{n}\right) := -\frac{k}{n} \log_2 \frac{k}{n} - \left(1 - \frac{k}{n}\right) \log_2\!\left(1 - \frac{k}{n}\right)$, which can be
derived by setting $p = k/n$ in

$$1 = \sum_{j=0}^{n} \binom{n}{j} p^j (1 - p)^{n - j} \ge \binom{n}{k} p^k (1 - p)^{n - k} , \qquad 0 \le k \le n .$$

Thus, $\frac{1}{n} C(i^{(n)}) \le H_2\!\left(\frac{k}{n}\right) + 2\, \frac{\log_2 k + 1}{n}$. Consider now the strings $i^{(n)}$ to be
prefixes, that is the initial $n$ bits, of infinite binary sequences $i \in \Omega_2$. Let $0 < p < 1$ be
the probability of the bit 1; if $k/n \to p$, then

$$\limsup_{n \to \infty} \frac{C(i^{(n)})}{n} \le H(\pi) , \tag{4.7}$$

where $H(\pi)$ is the ($\log_2$) entropy rate of a Bernoulli binary source with probability
$\pi = (p, 1 - p)$. The upper bound in (4.5) is thus not a loose one for $p$ close to 1/2.
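The entropy bound on the binomial coefficient and the resulting bound per symbol can be checked numerically. The following Python sketch uses only the standard `math` module; the function names are choices of this illustration, and the second function simply evaluates the right hand side of the inequality of the example divided by $n$.

```python
import math


def h2(x: float) -> float:
    """Binary entropy -x log2 x - (1 - x) log2 (1 - x), with h2(0) = h2(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def per_symbol_bound(n: int, k: int) -> float:
    """(log2 C(n, k) + 2 (log2 k + 1)) / n."""
    extra = 2 * (math.log2(k) + 1) if k > 0 else 0.0
    return (math.log2(math.comb(n, k)) + extra) / n
```

For $n = 1000$ and $k = 100$ the bound per symbol is slightly below 0.48 bits, close to $H_2(0.1) \simeq 0.469$, whereas for $k = 500$ it exceeds 1 and thus does not improve on (4.5).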
Example 4.1.6 One would expect the algorithmic complexity of a pair $(i, j)$ of
strings $i, j \in \Omega_2^*$ to be smaller (apart from the usual additive constant independent
of them) than the sum of the algorithmic complexities of $i$ and $j$, namely:

$$C((i, j)) \le C(i) + C(j) + C .$$

Intuitively, this should be so because one can always put together the shortest programs
$p$, respectively $q$, for $i$, respectively $j$, in a program $pq$ which is an effective
description of $(i, j)$. Unfortunately, the plain algorithmic complexity does not enjoy
the form of subadditivity expressed by the previous inequality.

Indeed, if $p$, $q$ are two programs such that $C(i) = \ell(p)$ and $C(j) = \ell(q)$, then
any program using $p$ and $q$ to output the pair $(i, j)$ must separate $p$ from $q$, for
instance by prefixing $p$ with its length $\ell(p)$ encoded by $\beta(\ell(p))$ (see the proof of
Proposition 4.1.1) at the cost of $2(\log_2 \ell(p) + 1)$ extra bits. In this way, the reference
UTM $\mathfrak{U}$ first computes $p$ generating $i$, then computes $q$, generating $j$, and finally
outputs $(i, j)$. Thus, one estimates:

$$C((i, j)) \le \ell\big(\beta(\ell(p))\, p\big) + \ell(q) + C_0 \le C(i) + C(j) + 2 \log_2 \ell(p) + C_1 ,$$

where $C_{0,1}$ are additive constants independent of the strings considered.

The $\log_2 \ell(p)$ extra bits cannot in general be avoided by reducing them to an additive
constant independent of the input string. Indeed [144,366], let $\ell(i) = n$, $\ell(j) = m$
and set $k := n + m$; there are $(k + 1) 2^k$ pairs $(i, j)$ such that the concatenated string
$ij \in \Omega_2^{(k)}$. By setting $2^c = (k + 1) 2^k$ in (4.6), one gets that at least one pair $(i, j)$ of
such strings satisfies

$$C((i, j)) \ge k + \log_2 (k + 1) .$$

Then, using (4.5), from $k = n + m = \ell(i) + \ell(j)$ it follows that

$$C((i, j)) \ge C(i) + C(j) + \log_2 (k + 1) - C .$$


Remarks 4.1.4 1. Since the algorithmic complexities of $i^{(n)}$ with respect to two
UTMs differ by a constant independent of the string, one can fix a UTM $\mathfrak{U}$ once and for
all and drop the reference to it in $C_{\mathfrak{U}}(i^{(n)})$.

2. The additive constant $A$ in (4.5) can in principle be very large; however,
since it is the same for all target strings $i^{(n)}$, it becomes less and less important with
increasing $n$. The additive constant can even be got rid of if, as in Example 4.1.5,
one considers infinite strings $i \in \Omega_2$, their prefixes $i^{(n)} \in \Omega_2^{(n)}$ and lets $n \to \infty$ in
the algorithmic complexity per symbol $\frac{C(i^{(n)})}{n}$.

3. The bound (4.6) shows that the one in (4.5) is not too loose for large $n$. In fact,
the fraction of strings $i^{(n)}$ with complexity smaller than $n - c$, $0 < c < n$, can be
estimated by

$$\frac{\#\left\{ i^{(n)} \in \Omega_2^{(n)} \,:\, C(i^{(n)}) < n - c \right\}}{2^n} \le \frac{2^{n - c} - 1}{2^n} < 2^{-c} .$$

Therefore, the fraction of strings of length $n$ with complexity significantly
smaller than $n$ is small.

4. In view of the previous remark, it is suggestive to define as random those sequences
$i \in \Omega_2$ such that their initial prefixes $i^{(n)}$ fulfil $C(i^{(n)}) \ge n - c$ for all $n \in \mathbb{N}$, where
$c$ is a constant independent of $n$. Unfortunately, the very same reason why the
plain complexity is not subadditive (see Example 4.1.6) makes this definition not
very useful [366]. Fortunately, as we shall see in Sect. 4.3, by using prefix TMs
to compute programs one replaces the algorithmic complexity $C(i^{(n)})$ with the
so-called prefix complexity $K(i^{(n)})$ and, in so doing, restores subadditivity and
makes $K(i^{(n)}) \ge n - c$ for all $n \in \mathbb{N}$ a good definition of random sequences
$i \in \Omega_2$ [366].

4.1.2.1 Non-Computability of $C(i^{(n)})$

Algorithmic complexity is not computable; namely, there cannot exist an algorithm⁷
able to compute $C(i^{(n)})$ for all strings. Indeed [300], if such a program $q$ of length
$\ell(q) < \infty$ existed, then one could construct the following program $p$:

• Step 1: set $k = 0$ and let $i_0$ equal the empty string;

• Step 2: generate the $k$-th string $i_k$ in the lexicographically ordered set of
all binary strings, call for $q$ and compute $C(i_k)$;

• Step 3: if $C(i_k) > \ell(p)$ write $i_k$ and halt, else set $k = k + 1$ and
go to Step 2.

Since $q$, the program which computes the plain complexity of any input string,
is assumed to exist, $p$ also exists. Moreover, it has finite length $\ell(p)$ and halts with
the first binary string in lexicographical order, say $i_{k^*}$, whose plain complexity
exceeds $\ell(p)$ as output. Therefore, $p$ is an effective description of $i_{k^*}$ that is strictly
shorter than its shortest possible effective description, which is a contradiction.


Remark 4.1.5 [315] The non-computability of $C(i^{(n)})$ implies the undecidability
of the halting problem, namely that there cannot exist an algorithm able to decide
whether a UTM $\mathfrak{U}$ halts when processing a generic program $p$. Indeed, if such an
algorithm existed, then one could compute $C(i^{(n)})$ for all $i^{(n)}$. Effectively, one would
proceed by generating the binary strings in lexicographical order (each one of them
is a program) and subsequently processing them in dovetailed fashion [113,366].
That is, at stage 1, step 1 of program 1 is effected, at stage 2, step 2 of program 1
and step 1 of program 2, at stage $k$, step $k$ of program 1, step $k - j + 1$ of program
$j$, $1 \le j \le k$, and so on. At the $N$-th stage, there will be three groups of programs,

• those that have halted with $\mathfrak{U}(p) = i^{(n)}$;
• those that have halted with $\mathfrak{U}(p) \neq i^{(n)}$;
• those that are still being processed.


⁷ A TM according to the Church-Turing thesis.
Notice that in the third group there might be shorter programs than those which
have already halted. Let $p^*$ be one of the shortest in the first group. One cannot set
$C(i^{(n)}) = \ell(p^*)$ because it cannot be excluded that a program $p$ in the third group,
shorter than $p^*$, will halt later with $\mathfrak{U}(p) = i^{(n)}$. However, if the halting problem could
be decided, then one would have exactly this vital piece of information and, waiting
long enough, would have a means to find the shortest one among those programs
such that $\mathfrak{U}(p) = i^{(n)}$.
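The dovetailed schedule can be illustrated with Python generators, each call to `next` standing for one computational step of a program. Modelling programs as generator functions, as well as the helper names, are assumptions of this sketch.

```python
def halts_after(steps, output):
    """Build a toy program that performs `steps` steps and then halts with `output`."""
    def program():
        for _ in range(steps):
            yield
        return output
    return program


def loops():
    """A toy program that never halts."""
    while True:
        yield


def dovetail(programs, stages):
    """At stage k start program k, then advance every running program by one step."""
    running, halted = {}, {}
    for k in range(1, stages + 1):
        if k <= len(programs):
            running[k] = programs[k - 1]()
        for j in sorted(running):
            try:
                next(running[j])
            except StopIteration as stop:   # program j has halted
                halted[j] = stop.value
                del running[j]
    return halted, sorted(running)
```

After any finite number of stages the function returns the outputs of the programs that have halted together with the list of those still running; nothing can be inferred about whether the latter will ever halt.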


In spite of the fact that the plain complexity is not computable, the previous remark
provides a means to effectively approximate it from above; namely, one can construct
a sequence of functions $C_t$ that can be computed by a UTM $\mathfrak{U}$ on any binary input
string $i^{(n)}$ and get closer to $C(i^{(n)})$ with increasing $t$ [144]. Let $\mathfrak{U}_t(p)$ denote the
output of the computation by $\mathfrak{U}$ of a program $p$ that halts in $t$ steps. By processing in
dovetailed fashion the programs of length $\ell(p) \le t$, one can check whether during
the first $t$ computational steps some of them has halted with output $i^{(n)}$, in which
case one sets

$$\widetilde{C}_t(i^{(n)}) := \min\left\{ \ell(p) \le t \,:\, \mathfrak{U}_t(p) = i^{(n)} \right\} , \qquad \widetilde{C}_t(i^{(n)}) = +\infty \ \text{otherwise} .$$

Finally, with reference to the loose upper bound (4.5), let

$$C_t(i^{(n)}) := \min\left\{ \widetilde{C}_t(i^{(n)}) \,,\, n + A \right\} .$$

The function $C_t(i^{(n)})$ can only decrease with increasing $t$; moreover, from Definition
4.1.3, $C_t(i^{(n)}) \ge C(i^{(n)})$ so that it tends to the plain complexity of $i^{(n)}$ monotonically
from above. One says that the plain complexity is semi-computable from
above. Notice that, although we know that the approximating values $C_t(i^{(n)})$ tend to
$C(i^{(n)})$ from above, yet we do not know how far from the actual value $C(i^{(n)})$ any
given $C_t(i^{(n)})$ might be.
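The behaviour of the approximations $C_t$ can be sketched in Python on made-up data. The table below, which lists for a few programs their output and halting time, is entirely hypothetical and replaces the run of a UTM; it only serves to show how the approximations decrease in steps.

```python
import math

# Hypothetical data: program -> (output, number of steps before halting).
TABLE = {
    "1100110011": ("x", 3),
    "10110": ("x", 50),
    "111": ("x", 400),
    "00": ("y", 2),
}


def c_t(target, t, table, n, A):
    """min over programs with length <= t halting within t steps, capped by n + A."""
    lengths = [len(p) for p, (out, steps) in table.items()
               if len(p) <= t and steps <= t and out == target]
    c_tilde = min(lengths, default=math.inf)
    return min(c_tilde, n + A)
```

With $n = 10$ and $A = 2$ the approximation stays at 12 for small $t$, drops to 10 at $t = 10$, to 5 at $t = 50$ and to 3 only at $t = 400$: at no finite $t$ does the value reached tell whether a further drop is still to come.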


Definition 4.1.4 A real function $f$ on $\Omega_2^*$ is called semi-computable from above
if there exists a non-increasing sequence of functions $\{f_k\}_{k \in \mathbb{N}}$ on $\Omega_2^*$ with rational
values⁸ such that they are computable in the sense of Definition 4.1.2 and
$\lim_{k \to \infty} f_k(i^{(n)}) = f(i^{(n)})$. A real function $f$ on $\Omega_2^*$ is called semi-computable
from below if $-f$ is semi-computable from above. A real function $f$ on $\Omega_2^*$ is
computable if it is semi-computable both from above and below.


Remarks 4.1.6 1. The difference between semi-computable and computable functions
can be understood as follows. If $f$ is computable then there exist two monotone
sequences of rational-valued computable functions $\{f_k^{a,b}\}_{k \in \mathbb{N}}$, $f_k^a$ non-increasing
and $f_k^b$ non-decreasing, such that

$$\lim_{k \to \infty} f_k^{a,b}(i^{(n)}) = f(i^{(n)}) .$$

It follows that one can always estimate, for any $i^{(n)} \in \Omega_2^*$, the distance between the
computed values $f_k^{a,b}(i^{(n)})$ and the actual value $f(i^{(n)})$ by means of the computable
difference $f_k^a(i^{(n)}) - f_k^b(i^{(n)})$.

2. The approximations $f_k(i)$ of a function $f(i)$ semi-computable from below can be
seen as the result of a same program (binary string) $p_f$. When a reference UTM
$\mathfrak{U}$ is presented with $p_f$, together with the binary representation $i(k)$ of $k$ and
an input string $i \in \Omega_2^*$, it computes $f_k(i)$, that is $\mathfrak{U}(\langle p_f, i(k), i \rangle) = f_k(i)$, where
$\langle p_f, i(k), i \rangle$ is the binary string which encodes and separates the various inputs.
Consequently, like computable functions, semi-computable functions can also
be enumerated.

⁸ Any $p/q$, $p, q \in \mathbb{N}$, can be written as a binary string $(p, q) \in \Omega_2^*$.


An interesting class of lower semi-computable functions consists of the so-called 
constructive semi-measures [144,366]. 


Definition 4.1.5 A positive function $\mu : \Omega_2^* \to \mathbb{R}$ is called a semi-measure if
$\sum_{i \in \Omega_2^*} \mu(i) \le 1$ and a constructive semi-measure if it is semi-computable from
below. A constructive semi-measure $m : \Omega_2^* \to \mathbb{R}$ is called a universal semi-measure
if for any constructive semi-measure $\mu$ there exists a constant $C_\mu$ such
that

$$C_\mu\, \mu(i) \le m(i) \qquad \forall\, i \in \Omega_2^* .$$


Working with semi-measures $\mu$ instead of measures allows for more freedom; for
instance constructive measures turn out to be automatically computable. Namely,
if $\mu_k$ is a non-decreasing sequence of rational-valued computable functions that
approximate $\mu$ from below and $\sum_{i \in \Omega_2^*} \mu(i) = 1$, one can construct a computable
approximation $0 \le \mu_k \le \mu$ such that, given $\varepsilon > 0$, $\sum_{i \in \Omega_2^*} \mu_k(i) \ge 1 - \varepsilon$. Then,
for all $i \in \Omega_2^*$ it holds that

$$\left| \mu(i) - \mu_k(i) \right| \le \sum_{i \in \Omega_2^*} \big( \mu(i) - \mu_k(i) \big) \le \varepsilon .$$


Example 4.1.7 [144,366] Constructive semi-measures can be enumerated (see
Remark 4.1.6.2); let $\{\mu_n\}$ denote their list and let $\{\alpha(n)\}_{n \in \mathbb{N}}$ be lower
semi-computable positive numbers such that $\sum_n \alpha(n) \le 1$. Then

$$m := \sum_n \alpha(n)\, \mu_n \ge \alpha(k)\, \mu_k \qquad \forall\, k .$$

$m$ is thus a dominating semi-measure; it is also constructive and thus universal in the
sense of Definition 4.1.5; indeed, there exists a two-argument lower semi-computable
function $\mu(i^{(n)}, n)$ that reproduces all constructive semi-measures by varying $n \in \mathbb{N}$.
The idea of the proof is as follows. Given a lower semi-computable function $f$
and a non-decreasing sequence of rational-valued approximations $f_k$, let $p_f$ be the
binary program that allows a reference UTM $\mathfrak{U}$ to compute them as outlined in
Remark 4.1.6.2 and let $\{i_j\}_{j \in \mathbb{N}}$ be the lexicographically ordered list of all binary
strings. By computing them in dovetailed fashion, let then $\mathfrak{U}^k_{p_f}$ be the computable
function defined by

$$\mathfrak{U}^k_{p_f}(i_j) = \begin{cases} \mathfrak{U}(\langle p_f, i(k), i_j \rangle) & \text{if } \ell(i_j) \le k \\ 0 & \text{otherwise} \end{cases} .$$

Notice that $\mathfrak{U}^k_{p_f} \to f$ when $k \to +\infty$; then, consider the recursive effective procedure
consisting of the following steps:

1. set $k = 0$ and $\mu^0_{p_f}(i_j) = 0$;
2. set $k = k + 1$;
3. compute $\mathfrak{U}^k_{p_f}(i_1), \mathfrak{U}^k_{p_f}(i_2), \ldots, \mathfrak{U}^k_{p_f}(i_k)$ in dovetailed fashion; if some $\mathfrak{U}^k_{p_f}(i_j)$ has
not halted go to Step 5, else compute $\sum_j \mathfrak{U}^k_{p_f}(i_j)$;
4. if $\sum_j \mathfrak{U}^k_{p_f}(i_j) \le 1$, set $\mu^k_{p_f} = \mathfrak{U}^k_{p_f}$ and go to Step 2, else
5. set $\mu^k_{p_f} = \mu^{k-1}_{p_f}$ and stop.

By construction, the function $\mu(i^{(n)}, p_f) = \lim_{k \to +\infty} \mu^k_{p_f}(i^{(n)})$ is lower
semi-computable and a semi-measure; further, it coincides with $f$ if the latter is itself
a constructive semi-measure.


4.1.2.2 Algorithmic Complexity and Thermodynamics 

Besides its many mathematical applications, algorithmic complexity has also been
used to explore the relations between computation and thermodynamics [67,68,70,
315,366]. As already remarked in this section, computing is a physical process and
questions about its thermodynamic cost are surely of practical importance, but also of
general interest as they amount to asking which computational steps are intrinsically
irreversible and which ones can instead be performed reversibly [67]. As nicely
illustrated in [315], trying to answer these questions brings together thermodynamics,
computability theory and Gödel's incompleteness theorem.

The starting step is the observation [67, 139,222] that the only irreversible com- 
puter operations are intrinsically logically irreversible, namely those with outputs 
that do not uniquely identify the input. The most obvious instance of such operations 
is erasure and, as an oversimplified case, consider one molecule of gas contained in 
a cubic box of volume V in which a freely moving piston can be used to confine the 
molecule on the left side of the box, a case which is read as a bit 1. The flip operation 
which turns 1 into 0 can be effected reversibly by slowly rotating the box around its 
vertical axis and thus exchanging its right and left sides. 

In order to erase this bit of information, the piston can be let loose so
that free expansion (of one molecule) allows the molecule, which was confined in a
volume $V/2$ before, to wander later within the whole volume $V$. If the process occurs
isothermally at temperature $T$, the loss of information corresponding to the increase
of the space at disposal corresponds to an increase in thermodynamical entropy and
a decrease of free energy (the internal energy does not change in isothermal processes):


$$\Delta S = \kappa \log 2 , \qquad \Delta F = \Delta U - T\, \Delta S = -\kappa\, T \log 2 .$$


By extrapolating this simple observation, one is naturally led to the identification of
free energy and free memory: one can consume free memory to store data instead of
erasing them and in this way save free energy, or, vice versa, by consuming free
energy in erasure processes one saves free memory [315].
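
As a rough numerical illustration of the erasure cost $\kappa T \log 2$ derived above, the following Python sketch evaluates it at a given temperature. The function name `erasure_cost_joules` and the choice of 300 K are assumptions made only for this example; the Boltzmann constant is the exact SI value.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI).
BOLTZMANN = 1.380649e-23


def erasure_cost_joules(temperature_kelvin: float, bits: int = 1) -> float:
    """Free energy consumed by isothermally erasing `bits` bits: bits * kappa * T * log 2."""
    return bits * BOLTZMANN * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # Hypothetical room-temperature example (300 K is an assumption).
    print(erasure_cost_joules(300.0))     # roughly 2.87e-21 J for a single bit
    print(erasure_cost_joules(300.0, 8))  # eight bits cost eight times as much
```

The cost is linear in the number of erased bits, which is what makes compressing the memory before erasure worthwhile.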

Differently from erasure which can in no way be turned into a reversible operation, 
all other operations are only superficially irreversible and can be made reversible by 
adding enough information [67]. For instance, binary addition maps the pairs (0, 0)
and (1, 1) into 0 and the pairs (0, 1) and (1, 0) into 1. Therefore, by reading off 0 (1) one
cannot decide which couple of bits was the input; however, conserving the inputs
and writing them together with their outputs turns the binary addition ($\oplus$) into a
reversible operation:


$$\underbrace{\begin{array}{l} 0 \oplus 0 = 0 \\ 0 \oplus 1 = 1 \\ 1 \oplus 0 = 1 \\ 1 \oplus 1 = 0 \end{array}}_{\text{irreversible}}
\qquad \Longrightarrow \qquad
\underbrace{\begin{array}{l} (0, 0) \mapsto (0, 0, 0) \\ (0, 1) \mapsto (0, 1, 1) \\ (1, 0) \mapsto (1, 0, 1) \\ (1, 1) \mapsto (1, 1, 0) \end{array}}_{\text{reversible}}$$


Unfortunately, the redundant information that is used in order to make operations 
reversible has to be stored and this occupies free memory so that massive erasure 
operations are eventually needed, free energy consumed and heat waste generated. 
In order to minimize free energy consumption, one can first proceed to reversibly 
compress as much as possible the stored information to be erased. For instance, in 
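the case of the binary addition, one can use only the first input bit since the second
one can be recovered by binary subtraction ($\ominus$) from the output bit:


$$\underbrace{\begin{array}{lll} (0, 0) \mapsto (0, 0) , & & 0 \ominus 0 = 0 \\ (0, 1) \mapsto (0, 1) , & & 1 \ominus 0 = 1 \\ (1, 0) \mapsto (1, 1) , & & 1 \ominus 1 = 0 \\ (1, 1) \mapsto (1, 0) , & & 0 \ominus 1 = 1 \end{array}}_{\text{still reversible}}$$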
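
The two tables above can be checked mechanically. The sketch below is only an illustration with hypothetical function names: it tests injectivity of the three maps on the four input pairs and inverts the compressed map, using the fact that binary subtraction modulo 2 coincides with XOR.

```python
from itertools import product


def xor_irreversible(a: int, b: int) -> int:
    """Plain binary addition: the output alone does not identify the input."""
    return a ^ b


def xor_keep_inputs(a: int, b: int) -> tuple:
    """Reversible version: both inputs are stored together with the output."""
    return (a, b, a ^ b)


def xor_compressed(a: int, b: int) -> tuple:
    """Reversible version that stores only the first input and the output."""
    return (a, a ^ b)


def xor_compressed_inverse(a: int, c: int) -> tuple:
    """Recover (a, b) from (a, c); subtraction modulo 2 is again XOR."""
    return (a, c ^ a)


def is_injective(f) -> bool:
    """True if f maps the four input pairs to four distinct outputs."""
    inputs = list(product((0, 1), repeat=2))
    return len({f(a, b) for a, b in inputs}) == len(inputs)


if __name__ == "__main__":
    print(is_injective(xor_irreversible))  # False
    print(is_injective(xor_keep_inputs))   # True
    print(is_injective(xor_compressed))    # True
```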


Suppose the occupied memory consists of a binary string $i^{(n)}$, then the best compression
achievable is given by the shortest binary program $p^*$ such that $\mathcal{U}(p^*) = i^{(n)}$
whose length is the Kolmogorov complexity $C(i^{(n)})$. Reversibly encoding $i^{(n)}$
into $p^*$ and erasing the latter entails the optimal loss of free energy $\Delta_{opt} F =
-\kappa\, T\, C(i^{(n)}) \log 2$ to be compared with $\Delta F = -n\, \kappa\, T \log 2$.
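
Since $C(i^{(n)})$ is not computable, any concrete estimate can only bound it from above. The sketch below uses the general-purpose compressor `zlib` as a stand-in for the optimal program $p^*$; this substitution is an assumption of the example, and the resulting numbers are not Kolmogorov complexities. The function names and the temperature are hypothetical.

```python
import math
import zlib

BOLTZMANN = 1.380649e-23  # J/K


def compressed_bits(data: bytes) -> int:
    """Bit length of a zlib encoding of `data`: a computable proxy that only
    bounds the achievable compression from above, not C itself."""
    return 8 * len(zlib.compress(data, 9))


def erasure_costs(data: bytes, temperature_kelvin: float) -> tuple:
    """Return (cost of erasing the raw memory, cost after lossless compression) in joules."""
    unit = BOLTZMANN * temperature_kelvin * math.log(2)
    raw_bits = 8 * len(data)
    # Never do worse than erasing the raw memory directly.
    return raw_bits * unit, min(raw_bits, compressed_bits(data)) * unit


if __name__ == "__main__":
    regular_memory = b"01" * 5000  # highly regular content, chosen for illustration
    raw, compressed = erasure_costs(regular_memory, 300.0)
    print(raw, compressed)
```

For regular memory contents the second figure is far smaller than the first, while for incompressible contents the two coincide.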

These considerations suggest [387] that, when dealing with the thermodynamics
of computation, the notion of entropy should be improved by the addition to the
standard thermal contribution, $S_{therm}$, of the one coming from the optimal erasure of
the memory


$$S_{comp} = S_{therm} + \kappa\, C(M) \log 2 ,$$

where $C(M)$ is the algorithmic complexity of the computer memory. For instance, by
using $S_{comp}$, Maxwell's demon paradox [225] can be solved by observing [387]
that $S_{therm}$ can indeed be diminished by the demon collecting together all fastest
particles and transferring heat from lower to higher temperatures. However, storing
all the information necessary to compare particle velocities rapidly consumes free
memory and calls for erasure, thus restoring the second law of thermodynamics.

Unfortunately, the main problem with optimal compression is that it is based
on the knowledge of the algorithmic complexity of the occupied memory, which
cannot always be computed. In short, performing an optimal compression of
the memory content before erasure is not always possible and there will in general be
an excess of free energy consumption. As this is ultimately due to the undecidability
of the halting problem, this effect can be suggestively and not unduly called Gödel
friction [315].


4.2 Algorithmic Complexity and Entropy Rate 


Despite Remark 4.1.4.4, there is a sense in which the Kolmogorov complexity
can be used to look at the individual trajectories of a classical dynamical system
$(\mathcal{X}, T, \mu)$ and at their randomness, namely through their asymptotic complexity
rate. As explained in Sect. 2.2, a partition $\mathcal{P}$ of $\mathcal{X}$ provides a symbolic model
$(\widetilde{\Omega}_p, T_\sigma, \mu_{\mathcal{P}})$ whereby trajectories are reduced to sequences $i \in \widetilde{\Omega}_p \subseteq \Omega_p$ of
symbols from an alphabet with $p$ letters of which one can study the complexity of the
prefixes $i^{(n)} \in \Omega_p^{(n)}$.²

As for the Shannon entropy, when dealing with sequences, one may decide to 
focus not on the Kolmogorov complexity which generically diverges, rather upon its 
rate or complexity per symbol [8,88]. 


Definition 4.2.1 The complexity rate of a sequence $i \in \Omega_p$ is given by


$$c(i) := \limsup_{n\to\infty} \frac{1}{n}\, C(i^{(n)}) ,$$


where $i^{(n)}$ is the initial prefix of $i$ of length $n$.

Given a dynamical system $(\mathcal{X}, T, \mu)$ and a finite, measurable partition $\mathcal{P}$ of $\mathcal{X}$, let
$i(x) \in \Omega_p$ denote the symbolic trajectory that $\mathcal{P}$ associates to the trajectory $\{T^n x\}_{n\ge 0}$
issuing from $x \in \mathcal{X}$. Then, the complexity rate of $\{T^n x\}_{n\ge 0}$ with respect to $\mathcal{P}$ is
$c(x, \mathcal{P}) := c(i(x))$.


² In order to do this, one has to extend Definition 4.1.3 to the case of strings of symbols from generic
finite alphabets. This is straightforward and will always be understood in the following.

To start with, we shall consider the case of a dynamical system which is itself
already a symbolic model, namely a binary information source. An important result 
is that, typically, for sequences emitted by ergodic sources, the bound (4.7) becomes 
an equality in the limit. 


Theorem 4.2.1 (Brudno's Theorem) Let $(\Omega_2, T_\sigma, \pi)$ be a binary ergodic source
with entropy rate $h(\pi)$. Then,


$$c(i) = \lim_{n\to\infty} \frac{1}{n}\, C(i^{(n)}) = h(\pi) , \qquad (4.8)$$

for almost all $i \in \Omega_2$ with respect to $\pi$.
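
The theorem concerns the non-computable quantity $C$, so it cannot be verified numerically; still, a lossless compressor gives a computable upper bound on description length and makes the statement tangible. The following sketch, which uses `zlib` and a pseudo-random Bernoulli source purely as assumptions of the example, compares the compressed length per symbol with the entropy rate $h(\pi) = -p\log_2 p - (1-p)\log_2(1-p)$. All function names are hypothetical.

```python
import math
import random
import zlib


def binary_entropy(p: float) -> float:
    """Entropy rate (bits per symbol) of a Bernoulli source with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bernoulli_bits(n: int, p: float, seed: int) -> list:
    """n pseudo-random bits, each equal to 1 with probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def pack_bits(bits: list) -> bytes:
    """Pack bits into bytes (zero-padded) so that each source symbol occupies one bit."""
    padded = bits + [0] * ((-len(bits)) % 8)
    return bytes(
        int("".join(map(str, padded[k:k + 8])), 2) for k in range(0, len(padded), 8)
    )


def compression_rate(bits: list) -> float:
    """Compressed length in bits per source symbol: an upper-bound proxy for C(i^(n))/n."""
    return 8 * len(zlib.compress(pack_bits(bits), 9)) / len(bits)


if __name__ == "__main__":
    for p in (0.1, 0.5):
        rate = compression_rate(bernoulli_bits(80_000, p, seed=1))
        print(p, binary_entropy(p), rate)
```

A practical compressor stays above the entropy rate, and the gap reflects its suboptimality rather than any failure of the theorem.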


The proof [88, 199,376] consists (1) in using the counting argument (4.6) and the
AEP (Proposition 3.2.2) to show that


$$\liminf_{n\to\infty} \frac{1}{n}\, C(i^{(n)}) \ge h(\pi) \qquad \pi\text{-a.e.} ; \qquad (4.9)$$


and (2) in providing, for the initial prefixes $i^{(n)}$ of $\pi$-almost all $i \in \Omega_2$, an appropriate
binary program $p_i(n) \in \Omega_2^*$ such that $\lim_{n\to\infty} \frac{\ell(p_i(n))}{n} \le h(\pi)$ and $\mathcal{U}(p_i(n)) = i^{(n)}$,
whence

$$\limsup_{n\to\infty} \frac{1}{n}\, C(i^{(n)}) \le h(\pi) \qquad \pi\text{-a.e.} \qquad (4.10)$$


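Proof of the lower bound: Because of the assumption of ergodicity, Theorem 3.2.1
allows us to use the AEP with the entropy rate $h(\pi)$ in place of the Shannon entropy
$H(A)$. Let $A^{(n)}_\epsilon \subseteq \Omega_2^{(n)}$ be the set in (3.27) consisting of binary strings $i^{(n)}$ such that


$$2^{-n(h(\pi)+\epsilon)} \le \pi(i^{(n)}) \le 2^{-n(h(\pi)-\epsilon)} ,$$


and $\widetilde{A}^{(n)}_\epsilon \subseteq \Omega_2$ the set of sequences whose initial prefixes of length $n$, $i^{(n)}$, belong to
$A^{(n)}_\epsilon$ and have complexity $C(i^{(n)}) < n(h(\pi) - 2\epsilon)$. From (4.6), it follows that


$$\pi\big(\widetilde{A}^{(n)}_\epsilon\big) = \pi\Big(\big\{ i^{(n)} \in A^{(n)}_\epsilon : C(i^{(n)}) < n(h(\pi) - 2\epsilon) \big\}\Big)
\le \#\big\{ i^{(n)} \in A^{(n)}_\epsilon : C(i^{(n)}) < n(h(\pi) - 2\epsilon) \big\} \cdot \max_{i^{(n)} \in A^{(n)}_\epsilon} \pi(i^{(n)})
\le 2^{n(h(\pi)-2\epsilon)+1} \cdot 2^{-n(h(\pi)-\epsilon)} = 2^{-n\epsilon+1} .$$


Since strings $i^{(n)} \notin A^{(n)}_\epsilon$ may also have complexity $C(i^{(n)}) < n(h(\pi) - 2\epsilon)$, it is
necessary to control their overall probability. Let $(A^{(k)}_\epsilon)^c$ denote the set of sequences whose
initial prefixes of length $k$ do not belong to $A^{(k)}_\epsilon$ and set


$$\widetilde{B}^{(k)}_\epsilon := \big\{ i \in (A^{(k)}_\epsilon)^c : C(i^{(k)}) < k(h(\pi) - 2\epsilon) \big\} , \qquad B^{(n)}_\epsilon := \bigcup_{k\ge n} \widetilde{B}^{(k)}_\epsilon .$$


Since $\widetilde{B}^{(k)}_\epsilon \subseteq (A^{(k)}_\epsilon)^c$ implies $\pi(B^{(n)}_\epsilon) \le \pi\big(\bigcup_{k\ge n} (A^{(k)}_\epsilon)^c\big) = 1 - \pi\big(\bigcap_{k\ge n} A^{(k)}_\epsilon\big)$, it follows
that the probability of the set of sequences having, for some $k \ge n$, initial prefixes with complexity
$C(i^{(k)}) < k(h(\pi) - 2\epsilon)$ is estimated from above by


$$\pi\Big(\bigcup_{k\ge n} \big\{ \widetilde{A}^{(k)}_\epsilon \cup \widetilde{B}^{(k)}_\epsilon \big\}\Big) \le \pi\Big(\bigcup_{k\ge n} \widetilde{A}^{(k)}_\epsilon\Big) + \pi\big(B^{(n)}_\epsilon\big)
\le \frac{2^{-n\epsilon+1}}{1 - 2^{-\epsilon}} + \pi\big(B^{(n)}_\epsilon\big) \le \frac{2^{-n\epsilon+1}}{1 - 2^{-\epsilon}} + 1 - \pi\Big(\bigcap_{k\ge n} A^{(k)}_\epsilon\Big) .$$


The set $\bigcap_{k\ge n} A^{(k)}_\epsilon$ consists of sequences $i \in \Omega_2$ whose initial prefixes are typical
for all lengths $k \ge n$, therefore $\lim_{n\to\infty} \pi\big(\bigcap_{k\ge n} A^{(k)}_\epsilon\big) = 1$. It thus follows that
$\liminf_{n\to\infty} \frac{1}{n} C(i^{(n)}) \ge h(\pi) - 2\epsilon$ $\pi$-almost everywhere. Since $\epsilon$ is arbitrary the lower bound
follows.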


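Proof of the upper bound: Given $\Omega_2^{(n)} \ni i^{(n)} = i_1 i_2 \cdots i_n$, fix $0 < L < n$ and consider
all strings of length $L$ made of consecutive bits of $i^{(n)}$; there are $n - L + 1$ of
them:


$$s_k := i_k i_{k+1} \cdots i_{L+k-1} , \qquad 1 \le k \le n - L + 1 . \qquad (*)$$


Let $\Omega^{(L)}_{i^{(n)}}$ denote their set and let $N(s)$ be the number of occurrences of the string
$s \in \Omega^{(L)}_{i^{(n)}}$; $N(s)$ can be expressed as follows. Let $i \in \Omega_2$ be any sequence with initial
prefix of length $n$ equal to $i^{(n)}$, then


$$N(s) = \sum_{j=0}^{n-L} \chi_s\big(T_\sigma^j(i)\big) , \qquad (**)$$


where $T_\sigma$ is the left shift and $\chi_s(T_\sigma^j(i))$ is 1 if the initial prefix of length $L$ in $T_\sigma^j(i)$
equals $s$, 0 otherwise.
Given the $N(s)$, $s \in \Omega^{(L)}_{i^{(n)}}$, one can thus construct a so-called empirical probability
distribution $\pi^{(L)}_{i^{(n)}}$ on $\Omega^{(L)}_{i^{(n)}}$,


$$\pi^{(L)}_{i^{(n)}} = \big\{ p^{(L)}_{i^{(n)}}(s) \big\} , \qquad p^{(L)}_{i^{(n)}}(s) := \frac{N(s)}{n - L + 1} ,$$

with corresponding Shannon entropy

$$H\big(\pi^{(L)}_{i^{(n)}}\big) = - \sum_{s \in \Omega^{(L)}_{i^{(n)}}} p^{(L)}_{i^{(n)}}(s) \log_2 p^{(L)}_{i^{(n)}}(s) .$$


Notice that the set of lengths $\ell(s) := \lceil -\log_2 p^{(L)}_{i^{(n)}}(s) \rceil$ is such that

$$-\log_2 p^{(L)}_{i^{(n)}}(s) \le \ell(s) < -\log_2 p^{(L)}_{i^{(n)}}(s) + 1 ; \qquad (***)$$


therefore, they satisfy the Kraft inequality


$$\sum_{s \in \Omega^{(L)}_{i^{(n)}}} 2^{-\ell(s)} \le \sum_{s \in \Omega^{(L)}_{i^{(n)}}} p^{(L)}_{i^{(n)}}(s) = 1 .$$


Because of Proposition 3.2.1, there thus exists a binary prefix code over the strings
$s \in \Omega^{(L)}_{i^{(n)}}$ consisting of codewords $w(s)$ of lengths $\ell(s) := \ell(w(s))$.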
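
The construction of the empirical block distribution, its entropy and the code lengths $\lceil -\log_2 p(s) \rceil$ can be sketched in a few lines of Python. The function names and the sample string are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def block_counts(bits: str, L: int) -> Counter:
    """N(s): occurrences of each length-L block among the n - L + 1 sliding windows."""
    return Counter(bits[k:k + L] for k in range(len(bits) - L + 1))


def empirical_distribution(bits: str, L: int) -> dict:
    """p(s) = N(s) / (n - L + 1)."""
    total = len(bits) - L + 1
    return {s: c / total for s, c in block_counts(bits, L).items()}


def shannon_entropy(dist: dict) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values())


def code_lengths(dist: dict) -> dict:
    """Code-word lengths ceil(-log2 p(s))."""
    return {s: math.ceil(-math.log2(p)) for s, p in dist.items()}


def kraft_sum(lengths: dict) -> float:
    """Sum of 2^(-length); a value of at most 1 guarantees a binary prefix code."""
    return sum(2.0 ** (-length) for length in lengths.values())


if __name__ == "__main__":
    sample = "0101010101"  # hypothetical string i^(n) with n = 10
    dist = empirical_distribution(sample, 2)
    print(block_counts(sample, 2), shannon_entropy(dist))
    print(code_lengths(dist), kraft_sum(code_lengths(dist)))
```

For this sample the nine windows are five copies of `01` and four of `10`, the lengths are 1 and 2, and the Kraft sum is 0.75.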
With $s_k$ as defined in (*) above, for a given $1 \le j \le L - 1$, consider the adjacent
strings of length $L$ of the form $s_{j+p_jL}$, $0 \le p_j \le p_j^{max}$. Since the first bit of $s_j$ is
$i_j$ and the last bit of $s_{j+p_j^{max}L}$ is $i_{j+(p_j^{max}+1)L-1}$, then the bits not belonging to any
$s_{j+p_jL}$ are $i_1 i_2 \cdots i_{j-1}$ and $i_{j+(p_j^{max}+1)L} \cdots i_n$, whence


$$p_j^{max} = \left\lfloor \frac{n - j - L + 1}{L} \right\rfloor ,$$


for a total of no more than $2(L - 1)$ bits. Also, since any $1 \le k \le n$ can be written as
$k = j + pL$ with $1 \le j \le L - 1$ and $0 \le p$ uniquely determined, then, for different
$1 \le j \le L - 1$, the sets $S_j := \bigcup_{p_j=0}^{p_j^{max}} \{ s_{j+p_jL} \}$ do not overlap and $\bigcup_{j=1}^{L-1} S_j \subseteq \Omega^{(L)}_{i^{(n)}}$.

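Consider a program $Q_j$ that reconstructs $i^{(n)}$ by specifying the codewords
$w(s_{j+p_jL})$ plus the bits uncovered by them; its length can be bounded from above
as follows:


$$\ell(Q_j) \le C + 2(L - 1) + \sum_{p_j=0}^{p_j^{max}} \ell(s_{j+p_jL}) ,$$


where $C$ is a constant independent of $j$ and of $L$. Further, (***) entails the following
bound for the plain algorithmic complexity of $i^{(n)}$:


$$C(i^{(n)}) \le \min_{1\le j\le L-1} \ell(Q_j) \le \frac{1}{L-1} \sum_{j=1}^{L-1} \ell(Q_j)
\le C + 2(L-1) + \frac{1}{L-1} \sum_{j=1}^{L-1} \sum_{p_j=0}^{p_j^{max}} \ell(s_{j+p_jL})$$

$$\le C + 2(L-1) + \frac{1}{L-1} \sum_{s \in \Omega^{(L)}_{i^{(n)}}} N(s)\, \ell(s)
\le C + 2(L-1) + \frac{n-L+1}{L-1} \Big( H\big(\pi^{(L)}_{i^{(n)}}\big) + 1 \Big) .$$


From ergodicity and (**), it follows that, when $n \to \infty$,


$$\frac{N(s)}{n - L + 1} \to \pi\big(C^{[0,L-1]}_s\big)$$


for $\pi$-almost all sequences $i \in \Omega_2$, where $C^{[0,L-1]}_s$ is the cylinder set containing
all $i \in \Omega_2$ with $s \in \Omega_2^{(L)}$ as initial prefix. Thus, when $n \to \infty$, $\pi^{(L)}_{i^{(n)}}$ tends to the
probability distribution $\pi$ over the partition $\mathcal{C}^{(L)} = \big\{ C^{[0,L-1]}_s \big\}_{s \in \Omega_2^{(L)}}$ of $\Omega_2$ indexed by
the strings $s \in \Omega_2^{(L)}$; then, by continuity,


$$\limsup_{n\to\infty} \frac{1}{n}\, C(i^{(n)}) \le \frac{H_\pi(\mathcal{C}^{(L)}) + 1}{L - 1} .$$


By taking $L \to \infty$, the upper bound follows (see Remark 3.1.1.1).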

The previous result that holds for ergodic binary information sources can easily 
be extended to generic ergodic sources and then to ergodic dynamical systems via 
Definition 4.2.1. 


Proposition 4.2.1 Let $(\mathcal{X}, T, \mu)$ be an ergodic dynamical system and $\mathcal{P}$ a finite,
measurable partition of $\mathcal{X}$; then


$$c(x, \mathcal{P}) = h^{KS}_\mu(T, \mathcal{P}) \qquad \mu\text{-a.e.}$$


Proof The partition $\mathcal{P}$ defines a symbolic model $(\widetilde{\Omega}_p, T_\sigma, \mu_{\mathcal{P}})$ which is an ergodic
shift-dynamical system. The result follows since Brudno's theorem ensures that for
$\mu_{\mathcal{P}}$-almost all $i \in \widetilde{\Omega}_p$, hence for $\mu$-almost all $x \in \mathcal{X}$, it holds that $c(i) = h(\mu_{\mathcal{P}}) =
h^{KS}_\mu(T, \mathcal{P})$.


Corollary 4.2.1 Let $(\mathcal{X}, T, \mu)$ be an ergodic dynamical system and $\mathcal{P}$ a finite,
measurable generating partition of $\mathcal{X}$; then


$$c(x, \mathcal{P}) = h^{KS}_\mu(T) \qquad \mu\text{-a.e.}$$


4.3 Prefix Algorithmic Complexity


A way to eliminate the logarithmic correction that spoils the subadditivity of the
plain algorithmic complexity (see Example 4.1.6) is to ask that the only acceptable
programs for the UTM $\mathcal{U}$ are the so-called self-delimiting ones, namely those
containing the specification of their lengths, so that the UTM always knows when its
input programs end. These programs have the prefix property that if $\mathcal{U}$ halts on one
of them, say $p$, then $p$ cannot be the prefix of any other halting program for $\mathcal{U}$. Any
TM that accepts only programs with the prefix property is called a prefix TM; it
can be shown [97] that there exist prefix UTMs capable of simulating the behavior
of any other prefix TM. The consequences of the prefix constraint are far reaching.
One first proceeds to define an adapted version of algorithmic complexity of binary
strings (the extension to strings from different alphabets is straightforward).


Definition 4.3.1 (Prefix Algorithmic Complexity) The prefix algorithmic complexity
of $i^{(n)} \in \Omega_2^{(n)}$ is the length of the shortest program $p$ such that $\mathcal{U}(p) = i^{(n)}$, where
$\mathcal{U}$ is any chosen reference prefix UTM:


$$K(i^{(n)}) := \min \big\{ \ell(p) : \mathcal{U}(p) = i^{(n)} ,\ \mathcal{U} \text{ a prefix UTM} \big\} .$$


Remarks 4.3.1 1. A prefix TM can be pictured [97] as a TM with a control unit,
two tapes and two read-write heads. The first tape, the program tape, is entirely
occupied by the program which is written as a binary string between two blank 
symbols marking its beginning and its end; the program is read by a head that can 
only read, halt and move right. The second tape, the work tape, is, as in the case 
of an ordinary TM, two-way infinite and the head on it can read, write 0,1, leave 
a blank #, halt or move both right and left. The computation starts with the head 
on the program tape scanning the first blank symbol, the other head on the 0-th
cell of the work tape, only finitely many of its cells possibly carrying non-blank 
symbols, and with the control unit in its initial ready state q,. Then, in agreement 
with the symbols read by the two heads and the control unit internal state, the 
head on the working tape erases and writes or does nothing and then moves left, 
right or stays, the head on the program tape either moves right or stays, while the 
control unit updates its internal state. The computation terminates if the reading 
head on the program tape reaches the end of the program, in which case, the 
output is what is written on the work tape to the right of the cell being scanned 
by the head until only cells with blank symbols are found. The program halts if 
and only if the head on the program tape reaches the end of the tape. 

2. Since the set of programs with the prefix property is smaller than the set of all
programs, then


$$C(i^{(n)}) \le K(i^{(n)}) .$$

On the other hand, if $p$ is such that $C(i^{(n)}) = \ell(p)$, then, considering its self-delimiting
encoding $p^* := \overline{\ell(p)}\, p$ (the length $\ell(p)$ written in self-delimiting form, followed by $p$),
it follows that


$$K(i^{(n)}) \le \ell(p^*) \le C(i^{(n)}) + 2 \log_2 \ell(p) + C .$$


3. The prefix complexity is subadditive; in fact, if $p$ and $q$ are programs such that
$K(i) = \ell(p)$ and $K(j) = \ell(q)$, with $i, j \in \Omega_2^*$, then, since $p$ and $q$ are now, by
definition, self-delimiting, one has


$$K(i, j) \le K(i) + K(j) + C .$$


4. Unlike for the plain complexity (see Remark 4.1.4.4), one can rightly define as random
those sequences $i \in \Omega_2$ for which


$$K(i^{(n)}) \ge n - c ,$$


for all their prefixes $i^{(n)}$, that is all those sequences whose prefixes $i^{(n)}$ have
prefix complexity that increases at least as $n$. Indeed [366], it turns out that these
sequences are those and only those passing all constructive statistical Martin-Löf
tests checking whether they belong to effectively null sets (see Footnote 1).
In this sense, relative to the prefix definition of algorithmic complexity, Levin's
chaoticness and typicalness mentioned in the introduction to this section are
equivalent characterizations of randomness.
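
The logarithmic overhead in Remark 2 and the decodability of concatenated programs behind Remark 3 can be illustrated with one common self-delimiting convention. The convention used here is an assumption of the example and need not be the one intended in the text: the length $\ell(p)$ is written in binary with every bit doubled, followed by the terminator `01` and then by $p$ itself. The function names are hypothetical.

```python
def encode_self_delimiting(program: str) -> str:
    """Prefix `program` with its length in binary (each bit doubled) and the marker '01'."""
    length_bits = format(len(program), "b")
    header = "".join(bit + bit for bit in length_bits) + "01"
    return header + program


def decode_self_delimiting(stream: str) -> tuple:
    """Read one self-delimiting program from the front of `stream`.

    Returns (program, remaining stream); raises ValueError on malformed input.
    """
    pos = 0
    length_bits = []
    while stream[pos:pos + 2] != "01":
        pair = stream[pos:pos + 2]
        if pair not in ("00", "11"):
            raise ValueError("malformed header")
        length_bits.append(pair[0])
        pos += 2
    pos += 2  # skip the terminator
    length = int("".join(length_bits), 2)
    if len(stream) < pos + length:
        raise ValueError("stream too short")
    return stream[pos:pos + length], stream[pos + length:]


if __name__ == "__main__":
    p, q = "10110", "0"
    stream = encode_self_delimiting(p) + encode_self_delimiting(q)
    first, rest = decode_self_delimiting(stream)
    second, rest = decode_self_delimiting(rest)
    print(first, second, repr(rest))
```

With this convention the overhead is twice the number of binary digits of $\ell(p)$ plus two bits, and two encoded programs can be concatenated and split again without any separator.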


One of the most important consequences of working with prefix UTMs is that
their halting programs $p$ form a set of prefix codes for the output strings $\mathcal{U}(p) = i \in
\Omega_2^*$ and their lengths satisfy the extended Kraft inequality (3.2.2).


Example 4.3.1 Consider a prefix UTM $\mathcal{U}$ and the so-called Chaitin magic number
[66,99, 113] defined by $\Omega := \sum_{p\,:\,\mathcal{U}(p)\downarrow} 2^{-\ell(p)}$, where the sum runs over all halting
programs $p$; because of the prefix property, $\Omega \le 1$.
Let us consider the binary expansion of $\Omega$ which has infinitely many 0s if it is
rational and suppose an algorithm exists that calculates the digits of $\Omega$. Then, the
$n$-digit approximation $\Omega_n := \sum_{j=1}^{n} \omega_j\, 2^{-j}$ is such that $\Omega_n > \Omega - 2^{-n}$. Then, one knows
whether $\mathcal{U}$ halts on programs of length $< n$.
Indeed, by listing them in lexicographical order and by processing them in dovetailed
fashion, one can collect all programs $p_1, p_2, \ldots$ that halt until, after $T(n)$
computational steps,


$$S_n := \sum_{i=1}^{m(n)} 2^{-\ell(p_i)} \ge \Omega_n .$$

If $p$ is any program halting in more than $T(n)$ computational steps, one gets

$$\Omega \ge S_n + 2^{-\ell(p)} \ge \Omega_n + 2^{-\ell(p)} > \Omega + 2^{-\ell(p)} - 2^{-n} .$$


Therefore, $\ell(p) > n$ so that if a program of length shorter than $n$ has not halted in
$T(n)$ computational steps it will never halt.

Let $G(n)$ be the set of strings $i_j := \mathcal{U}(p_j)$, $j = 1, 2, \ldots, m(n)$, corresponding to
the outputs of the programs that have halted in $T(n)$ computational steps and let $i$
denote the first string (in a suitable order) not in $G(n)$. Such string must have prefix
complexity $K(i) \ge n$: indeed, if $K(i) < n$, there would exist a program $p$ of length
$< n$ such that $\mathcal{U}(p) = i$. However, from the previous discussion one deduces that also
$p$ must have halted in $T(n)$ computational steps so that $i \in G(n)$, too. Further, let $p^*$
be any shortest effective description of the string $\Omega^{(n)} := \omega_1 \omega_2 \cdots \omega_n$, consisting of
the first $n$ bits of $\Omega$, namely $K(\Omega^{(n)}) = \ell(p^*)$. Then, by means of a fixed number $c$
of extra bits, one can use the knowledge of $\Omega^{(n)}$ to recover $i$, whence


$$n \le K(i) \le \ell(p^*) + c = K(\Omega^{(n)}) + c \;\Longrightarrow\; K(\Omega^{(n)}) \ge n - c \qquad \forall\, n .$$

Then $\Omega$ is a random sequence in the sense of Remark 4.3.1.4.


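Definition 4.3.2 Given a prefix UTM $\mathcal{U}$, the map $\Omega_2^* \ni i \mapsto P_{\mathcal{U}}(i)$, where


$$P_{\mathcal{U}}(i) := \sum_{p\,:\,\mathcal{U}(p)=i} 2^{-\ell(p)} , \qquad (4.11)$$


defines a so-called universal probability on $\Omega_2^*$.


This definition makes sense, for, as a consequence of the prefix property, not
only (4.1) holds, but it also turns out that


$$\sum_{i \in \Omega_2^*} P_{\mathcal{U}}(i) = \sum_{i \in \Omega_2^*} \; \sum_{p\,:\,\mathcal{U}(p)=i} 2^{-\ell(p)} \le 1 .$$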


Remarks 4.3.2 1. If a prefix TM $\mathcal{U}$ halts on $p = 0$ and $q = 1$ with the strings $i$ and
$j$ as outputs, then $P_{\mathcal{U}}(i) = P_{\mathcal{U}}(j) = 1/2$ since no other program can halt. Without
the prefix restriction the sum in (4.11) would diverge simply because all programs
prefixed by $p$ and $q$ would also output $i$ and $j$.

2. After division by $\sum_{i \in \Omega_2^*} P_{\mathcal{U}}(i)$, $P_{\mathcal{U}}(i)$ represents the probability that $i$ be the output
of $\mathcal{U}$ running a binary program $p$ of length $\ell(p)$ randomly chosen according to
the Bernoulli uniform probability distribution that assigns probability $2^{-\ell(p)}$ to
any one of them. Since short programs have higher probabilities, random strings
have smaller algorithmic probabilities than regular ones.

3. The probability $P_{\mathcal{U}}$ is called universal (see Example 4.1.7) for the following
reason. Let $\mathcal{A}$ be any prefix TM and $q$ a program such that $\mathcal{A}(q) = i \in \Omega_2^*$; further,
let $q'$ be a self-delimiting program of fixed length $L$ that makes $\mathcal{U}$ simulate $\mathcal{A}$ so
that $\mathcal{U}(q'q) = \mathcal{A}(q) = i$. Then,


$$P_{\mathcal{U}}(i) = \sum_{p\,:\,\mathcal{U}(p)=i} 2^{-\ell(p)} \ge \sum_{q\,:\,\mathcal{U}(q'q)=i} 2^{-\ell(q'q)} = 2^{-L}\, P_{\mathcal{A}}(i) . \qquad (4.12)$$


Suppose now $\pi = \{p(i)\}_{i \in \Omega_2^*}$ to be a computable probability distribution over $\Omega_2^*$
(see Definition 4.1.2). Consider a prefix TM $\mathcal{A}$ that does the following:


• it computes the probability distribution $\pi$;

• it encodes the strings $i \in \Omega_2^*$ by means of the Shannon-Fano-Elias code
corresponding to the computed $\pi$ (see Example 3.2.3);

• given a program $q \in \Omega_2^*$, it checks whether $q$ is the Shannon-Fano-Elias code
for any $i \in \Omega_2^*$; if so, it outputs $i$.


Since the lengths of the code-words are as in (3.23), then, for all $i \in \Omega_2^*$,


$$P_{\mathcal{A}}(i) = \sum_{\mathcal{A}(q)=i} 2^{-\ell(q)} \ge \frac{1}{4}\, p(i) .$$


For the prefix UTM $\mathcal{U}$ to work as $\mathcal{A}$, it is necessary to compute the probability
distribution $\pi$, whence the program $q'$ in (4.12) is such that $L = K(\pi) + L'$, where
$K(\pi)$ is the prefix complexity of $\pi$, it being understood that the computable
probability distribution $\pi$ is written as a binary string (denoted by the same
symbol). Then, for all computable probability distributions $\pi$ on $\Omega_2^*$,


$$P_{\mathcal{U}}(i) \ge C\, 2^{-K(\pi)}\, p(i) , \qquad (4.13)$$

with $C > 0$ a constant independent of $i$ and $\pi$.


Universal probability, prefix complexity and Shannon entropy of computable
probability distributions are intimately related. Given a prefix UTM $\mathcal{U}$, the programs
$p^*$ such that $\mathcal{U}(p^*) = i \in \Omega_2^*$ with $\ell(p^*) = K(i)$ provide a prefix code such that


$$P_{\mathcal{U}}(i) = \sum_{\mathcal{U}(p)=i} 2^{-\ell(p)} \ge 2^{-K(i)} . \qquad (4.14)$$


Further, if the strings $i$ are chosen at random with respect to a computable probability
distribution $\pi$, then (3.2) implies that the corresponding average length, namely the
average prefix complexity, satisfies


$$\sum_{i \in \Omega_2^*} p(i)\, K(i) \ge H(\pi) = - \sum_{i \in \Omega_2^*} p(i) \log_2 p(i) . \qquad (4.15)$$


There might be infinitely many programs such that $\mathcal{U}(p) = i$, yet the lower bound
in (4.14) is surprisingly good as the sum is actually dominated by the shortest
programs for $i$.
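
The relation between (4.11) and (4.14) can be made concrete on a toy example. The finite prefix-free table below is entirely hypothetical: a genuine prefix UTM has infinitely many halting programs and its universal probability is not computable, so the sketch only illustrates the bookkeeping, not the actual quantities.

```python
from fractions import Fraction

# Hypothetical toy "prefix machine": a finite, prefix-free table program -> output.
TOY_MACHINE = {
    "0": "a",
    "10": "b",
    "110": "a",
    "1110": "c",
}


def is_prefix_free(programs) -> bool:
    """True if no program is a proper prefix of another one."""
    progs = list(programs)
    return not any(p != q and q.startswith(p) for p in progs for q in progs)


def universal_probability(machine: dict) -> dict:
    """P(i): sum of 2^(-len(p)) over all programs p with output i, as in (4.11)."""
    prob = {}
    for program, output in machine.items():
        prob[output] = prob.get(output, Fraction(0)) + Fraction(1, 2 ** len(program))
    return prob


def prefix_complexity(machine: dict) -> dict:
    """K(i): length of the shortest program with output i."""
    shortest = {}
    for program, output in machine.items():
        shortest[output] = min(shortest.get(output, len(program)), len(program))
    return shortest


if __name__ == "__main__":
    P = universal_probability(TOY_MACHINE)
    K = prefix_complexity(TOY_MACHINE)
    print(is_prefix_free(TOY_MACHINE), sum(P.values()))
    for output in sorted(P):
        print(output, P[output], K[output], Fraction(1, 2 ** K[output]))
```

In this table the output `a` has probability 5/8 and shortest program length 1, so the bound $2^{-K(i)} = 1/2$ already accounts for most of the sum, in line with the remark that the shortest programs dominate.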


Proposition 4.3.1 For all $i \in \Omega_2^*$, $P_{\mathcal{U}}(i) \le C\, 2^{-K(i)}$, where $C > 0$ is a constant
independent of $i$.


Together with (4.14), this result permits the identification (up to an additive
constant) of the prefix complexity of a string with minus the logarithm of its universal
probability.


Corollary 4.3.1 $K(i) = -\log_2 P_{\mathcal{U}}(i) + O(1)$.

There thus appears a similarity between
the fact that the optimal code-word lengths with respect to a probability distribution
$\pi = \{p(i)\}_{i \in I}$ are of the form $\ell^*_i = -\log_2 p(i)$ and the fact that the lengths of the
shortest descriptions of binary strings practically amount to minus the logarithm of their
universal probabilities. This similarity can be carried even further by examining the
average complexity.


Corollary 4.3.2 Given a computable probability distribution $\pi$ on $\Omega_2^*$, the
corresponding average prefix complexity satisfies


$$H(\pi) \le \sum_{i \in \Omega_2^*} p(i)\, K(i) \le H(\pi) + K(\pi) + C .$$

Proof From Proposition 4.3.1 and (4.13)

$$K(i) \le -\log_2 P_{\mathcal{U}}(i) + C \le -\log_2 p(i) + K(\pi) + C .$$


Multiplying by $p(i)$ and summing over $i \in \Omega_2^*$ yields the upper bound, while (4.15)
gives the lower bound.


Proof of Proposition 4.3.1: The idea is to construct, for each $i \in \Omega_2^*$, a set of
programs $p$ of length $\ell(p) \le -\log_2 P_{\mathcal{U}}(i) + C'$ with the prefix property such that
$\mathcal{U}(p) = i$, so that $K(i) \le \ell(p)$ would end the proof. Unfortunately, the argument
of Remark 4.3.2.3 is not viable as the universal probability is not computable.
However, as much as for the plain algorithmic complexity, the prefix complexity
is semi-computable from above whence the universal probability results lower semi-
computable because of Corollary 4.3.1; this turns out to be sufficient for constructing
a prefix code with the desired property. Let all the programs (listed in lexicographical
order) be run by $\mathcal{U}$ in dovetail fashion and collect them in pairs $(p_k, x_k)$ where $p_k$ is
the program which halts at the $k$-th step of the dovetailed computation with $x_k \in \Omega_2^*$
as output. The quantity


$$P_{\mathcal{U}}(k, x_k = x) := \sum_{\substack{(p_i,\, x_i = x) \\ i \le k}} 2^{-\ell(p_i)} \le P_{\mathcal{U}}(x)$$


is computable and tends to $P_{\mathcal{U}}(x)$ along the subsequence $\{(p_k, x_k = x)\}_k$; set $n_k :=
\lceil -\log_2 P_{\mathcal{U}}(k, x_k = x) \rceil$. Since


$$P_{\mathcal{U}}(x) \ge P_{\mathcal{U}}(k, x_k = x) \ge 2^{-\ell_x(k)} ,$$


where $\ell_x(k)$ is the smallest length in the sum, it follows that $n_k \le \ell_x(k)$. Given
$(p_k, x_k, n_k)$, this triplet is assigned to the first non-occupied node at the $(n_k + 1)$-th
level of a binary tree; further, in order to enforce the prefix condition, all nodes
stemming from it are made unavailable to further assignments. Since $n_k$ is not strictly
monotonic, it may happen that different pairs $(p_i, x_i = x)$, $i \le k$, have the same
$n_k$; by eliminating all but the first pair with that value of $n_k$, no more than one node
will be occupied by a triplet with the same $x_k$ at level $n_k$. Therefore,


$$n_k \ge -\log_2 P_{\mathcal{U}}(k, x_k = x) \ge -\log_2 P_{\mathcal{U}}(x) \;\Longrightarrow\; n_k = \lceil -\log_2 P_{\mathcal{U}}(x) \rceil + r_k$$


with $r_k \ge 0$ and $r_k \ne r_j$ for $j \ne k$. To each $x \in \Omega_2^*$ there correspond many
assignments of triplets $(p_k, x_k = x, n_k)$ each one of them to one and only one node at level
$n_k + 1$. The nodes thus provide binary code-words of length $n_k + 1$ for the triplets.
In order to see that there are sufficiently many nodes to accommodate all triplets, we
check that the lengths $n_k + 1$ satisfy the Kraft inequality (3.21). That this is indeed
so follows from the fact that


$$\sum_{x_k = x} 2^{-n_k} = 2^{-\lceil -\log_2 P_{\mathcal{U}}(x) \rceil} \sum_{x_k = x} 2^{-r_k} \le 2\, P_{\mathcal{U}}(x) ,$$


for all $x \in \Omega_2^*$, whence


$$\sum_{x \in \Omega_2^*} \; \sum_{x_k = x} 2^{-n_k - 1} \le \sum_{x \in \Omega_2^*} P_{\mathcal{U}}(x) \le 1 .$$


The above algorithm allows the construction of a binary tree whereby any $x \in \Omega_2^*$
can be identified with the binary string $i(x) \in \Omega_2^*$ corresponding to the lowest depth
node assigned to its triplets $(p_k, x_k = x, n_k)$. The length of the code-word $i(x) \in \Omega_2^*$
is the smallest $n_k + 1$:


$$\ell(i(x)) \le \lceil -\log_2 P_{\mathcal{U}}(x) \rceil + 1 \le -\log_2 P_{\mathcal{U}}(x) + 2 .$$


Finally, let $q$ be a program of fixed length $L$ that makes the prefix UTM $\mathcal{U}$ generate
the binary tree by dovetailed computation as specified above and let $q'$ be another
program of fixed length $L'$ with the necessary instructions to $\mathcal{U}$ such that, when
presented with the code-word $q'q\, i(x)$, $\mathcal{U}$ computes the program in the triplet assigned
to the node marked by $i(x)$, writes the result and halts. By construction, the program
$p$ in the triplet at the node $i(x)$ is such that $\mathcal{U}(p) = x$; then


$$K(x) \le \ell(q'q\, i(x)) \le -\log_2 P_{\mathcal{U}}(x) + L + L' + 2 .$$


Remark 4.3.3 Because of its construction the universal probability is a lower semi-
computable semi-measure (see Definition 4.1.5), thus there exists a constant $C_P$
such that $C_P\, P_{\mathcal{U}} \le m$, where $m$ is the universal semi-measure constructed in
Example 4.3.2. Furthermore, an argument similar to the one in the previous proof extends
the result in Corollary 4.3.1 to


$$K(i) = -\log_2 P_{\mathcal{U}}(i) + O(1) = -\log_2 m(i) + O(1) .$$


4.3.1 Bibliographical Notes 


The books [98–100] provide inspiring and motivating introductions to algorithmic
complexity theory, as well as the reviews [358,388], the latter one being especially 
devoted to comparing several characterizations of randomness. A reference book 
is [366] which also provides a historical overview of the development of the theory, 
several applications to a variety of mathematical, logical and physical problems, 
as well as more recent advances. A more abstract presentation is offered by [92] 
whereas a handier introduction can be found in [144] and in [113]. In [8] algorith- 
mic complexity is discussed in relation to stochastic processes, in [155] in relation to 
entropic tools in classical information theory and in [300] in relation with coding and 
statistical modeling. Finally, [114] presents an introduction to computable functions, 
recursion, the halting problem, Gödel's theorem and computational complexity. Possible
uses of algorithmic complexity theory in relation to the predictability of discrete
classical dynamical systems are discussed in [78]. Approaches to complexity issues 
in a broad sense can be found in [19,346]. 


Part II


Quantum Dynamical Systems 


In the second part of the book, quantum dynamical systems with finite and infi- 
nite degrees of freedom are presented by using the algebraic approach to quantum 
statistical mechanics. Particular emphasis is given to the notion of completely pos- 
itive maps in connection with the dissipative dynamics of open quantum systems, 
both in the memoryless and the non-Markovian scenarios, and to their bearing 
on the behaviour of information and entanglement. The corresponding technical 
framework proves convenient for the extension of ergodic and information theory 
to non-commutative contexts. 


5 Quantum Mechanics of Finite Degrees of Freedom


Quantum dynamical systems are described by means of non-commutative algebras 
of observables, by means of their time-evolution and by means of the expectation 
functionals that assign mean values to them. Classical dynamical systems can always 
be described in terms of phase-points and phase-trajectories; however, an algebraic 
formulation is always possible and has two advantages: on one hand, similarities and 
differences with respect to quantum dynamical systems become more evident and, 
on the other hand, one can infer from the algebraic reformulation of classical notions 
how to possibly extend them to the quantum setting. 

With reference to information, the most important difference that one encounters
passing from the commutative to the non-commutative setting is that the disturbances
exerted on quantum systems by measurement processes cannot in general be made
negligible, not even in principle.


5.1 Hilbert Space and Operator Algebras 


In standard quantum mechanics, physical states are usually described by normal- 
ized vectors in separable Hilbert spaces, and the observables by self-adjoint linear 
operators acting on them. Here follow some notation and basic facts.


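1. $|\psi\rangle$, $|\phi\rangle$, or $\psi$, $\phi$, and $|i\rangle$, with $i$ running on a suitable index set $I$, will
denote (normalized) vectors in Hilbert spaces $\mathbb{H}$ and $P_\psi = |\psi\rangle\langle\psi|$ the associated
orthogonal projectors.

2. The scalar product on $\mathbb{H}$, denoted by $\langle\psi|\phi\rangle$, linear in the second argument and
anti-linear in the first one, satisfies the Cauchy-Schwarz inequality


$$|\langle\psi|\phi\rangle| \le \|\psi\|\, \|\phi\| . \qquad (5.1)$$

3. Any finite or countable set $\{\psi_i\}_{i\in I} \subset \mathbb{H}$ such that $\langle\psi_i|\psi_j\rangle = \delta_{ij}$ and $|\psi\rangle =
\sum_{i\in I} \langle\psi_i|\psi\rangle\, |\psi_i\rangle$ for all $\psi \in \mathbb{H}$, is an orthonormal basis (ONB) in $\mathbb{H}$. The
corresponding projectors $P_i := |\psi_i\rangle\langle\psi_i|$ fulfil


$$\sum_{i\in I} P_i = \sum_{i\in I} |\psi_i\rangle\langle\psi_i| = \mathbb{1} , \qquad (5.2)$$


where $\mathbb{1}$ denotes the identity operator on $\mathbb{H}$, $\mathbb{1}|\psi\rangle = |\psi\rangle$ for all $\psi \in \mathbb{H}$.

4. Given any linear operator $X$ on $\mathbb{H}$, its matrix elements with respect to $\psi, \phi \in \mathbb{H}$
will be denoted either as $\langle\psi|X\phi\rangle$ or as $\langle\psi|X|\phi\rangle$, depending on notational
convenience.

5. $X^T$ and $X^*$ will denote transposition and complex conjugation with respect to a
given ONB $\{\psi_i\}_{i\in I}$:


$$\langle\psi_i|X^T\psi_j\rangle = \langle\psi_j|X\psi_i\rangle , \qquad \langle\psi_i|X^*\psi_j\rangle = \langle\psi_i|X\psi_j\rangle^* .$$


Instead, $X^\dagger = (X^T)^* = (X^*)^T$ will represent the basis-independent adjoint of
$X$:


$$\langle\psi|X^\dagger\phi\rangle = \langle X\psi|\phi\rangle = \langle\phi|X\psi\rangle^* \qquad \forall\, \psi, \phi \in \mathbb{H} .$$


Physical observables correspond to self-adjoint operators $X = X^\dagger$.

6. The uniform norm, $\|X\|$, of a linear operator $X$ on $\mathbb{H}$ is defined by


$$\|X\| = \sup_{\|\psi\|=1} \|X|\psi\rangle\| . \qquad (5.3)$$


7. $X$ is bounded if $\|X\| < \infty$, in which case


$$\|X|\psi\rangle\| \le \|X\|\, \|\psi\| , \qquad |\langle\psi|X\phi\rangle| \le \|X\|\, \|\psi\|\, \|\phi\| . \qquad (5.4)$$


Linear combinations of bounded operators are again bounded; their linear span
will be denoted by $B(\mathbb{H})$. The product of bounded operators is bounded, for
$\|XY|\psi\rangle\| \le \|X\|\, \|Y\|$. Therefore, $B(\mathbb{H})$ is a so-called $*$-algebra.

8. An operator $U \in B(\mathbb{H})$ such that $U^\dagger U = \mathbb{1}$ is called an isometry; in general,
$UU^\dagger = (UU^\dagger)(UU^\dagger)$ is a projection, if also $UU^\dagger = \mathbb{1}$, then $U$ is a unitary
operator. Isometries have $\|U\| = \sqrt{\|U^\dagger U\|} = \|\mathbb{1}\| = 1$.

9. The uniform norm defines on $B(\mathbb{H})$ uniform neighborhoods of the form


$$U_\epsilon(X) = \{ Y \in B(\mathbb{H}) : \|X - Y\| \le \epsilon \} , \quad \epsilon \ge 0 , \qquad (5.5)$$


whence a sequence $X_n \in B(\mathbb{H})$ converges uniformly to $X \in B(\mathbb{H})$, $\lim_{n\to\infty} X_n =
X$, if $\lim_{n\to\infty} \|X - X_n\| = 0$. The corresponding topology on $B(\mathbb{H})$, $\tau_u$, is called
uniform topology.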

5. 


n 


10. 


11. 


12. 


13. 


14. 
9. $\mathcal{B}(\mathbb{H})$ is complete with respect to the uniform topology, namely all sequences of
operators which are of Cauchy type with respect to the uniform norm converge to
an element of $\mathcal{B}(\mathbb{H})$. Therefore, $\mathcal{B}(\mathbb{H})$ is a so-called Banach $*$-algebra. Moreover,
since the uniform norm fulfils

$$\|X^\dagger\| = \|X\|, \qquad \|X^\dagger X\| = \|X\|^2, \tag{5.6}$$

$\mathcal{B}(\mathbb{H})$ is a C*-algebra (see Sect. 5.2).

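10. In the case of $n < \infty$ degrees of freedom, each one of them is described by a
Hilbert space $\mathbb{H}_j$, $1 \le j \le n$. Altogether, their Hilbert space is the tensor product
$\mathbb{H}^{(n)} = \bigotimes_{j=1}^n \mathbb{H}_j$, denoted by $\mathbb{H}^{\otimes n}$ when the Hilbert spaces $\mathbb{H}_j$ are copies of a
same $\mathbb{H}$. Depending on notational convenience, its vectors will be denoted either
by $|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle$ or by $|\psi\rangle = |\psi_1 \otimes \psi_2 \otimes \cdots \otimes \psi_n\rangle$, with
scalar products $\langle\phi|\psi\rangle = \prod_{j=1}^n \langle\phi_j|\psi_j\rangle$. Bounded operators on $\mathbb{H}^{(n)}$ are linear
combinations of tensor products of the form $X_1 \otimes X_2 \otimes \cdots \otimes X_n$, $X_j \in \mathcal{B}(\mathbb{H}_j)$; the
associated C* algebra of bounded operators on $\mathbb{H}^{(n)}$ is $\mathcal{B}(\mathbb{H}^{(n)}) := \bigotimes_{j=1}^n \mathcal{B}(\mathbb{H}_j)$.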
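11. The strong topology on $\mathcal{B}(\mathbb{H})$, $\tau_s$, is the smallest topology with respect to which
all semi-norms of the form $L_\psi(X) := \|X|\psi\rangle\|$, $\psi \in \mathbb{H}$, are continuous; its strong
neighborhoods are of the form

$$U^s_\varepsilon(X) := \{Y \in \mathcal{B}(\mathbb{H}) : L_{\psi_j}(Y - X) \le \varepsilon,\ 1 \le j \le n\}, \tag{5.7}$$

for $\psi_j \in \mathbb{H}$, $n \in \mathbb{N}$ and $\varepsilon > 0$. A sequence $X_n \in \mathcal{B}(\mathbb{H})$ converges strongly to
$X \in \mathcal{B}(\mathbb{H})$, $s\text{-}\lim_{n\to\infty} X_n = X$, if $\lim_{n\to\infty} \|(X_n - X)|\psi\rangle\| = 0$ for all $\psi \in \mathbb{H}$.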
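12. The weak topology on $\mathcal{B}(\mathbb{H})$, $\tau_w$, is the smallest topology with respect to which
all semi-norms of the form $L_{\phi,\psi}(X) := |\langle\phi|X\psi\rangle|$, $\phi, \psi \in \mathbb{H}$, are continuous;
its weak neighborhoods are of the form

$$U^w_\varepsilon(X) := \{Y \in \mathcal{B}(\mathbb{H}) : L_{\phi_j,\psi_j}(Y - X) \le \varepsilon,\ 1 \le j \le n\}, \tag{5.8}$$

for $\psi_j, \phi_j \in \mathbb{H}$, $n \in \mathbb{N}$ and $\varepsilon > 0$. A sequence $X_n \in \mathcal{B}(\mathbb{H})$ converges weakly
to $X \in \mathcal{B}(\mathbb{H})$, $w\text{-}\lim_{n\to\infty} X_n = X$, if $\lim_{n\to\infty} |\langle\phi|(X_n - X)\psi\rangle| = 0$ for all
$\phi, \psi \in \mathbb{H}$.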
13. Since strong neighborhoods are uniform neighborhoods, but the reverse is not
true when $\mathbb{H}$ is infinite dimensional, the uniform topology is in general finer than
the strong one, that is $\tau_u$ has more neighborhoods than $\tau_s$: $\tau_s \prec \tau_u$. The weak
topology is in general coarser than the strong one; every weak neighborhood is
also a strong neighborhood, but the reverse fails to be true in infinite dimensional
$\mathbb{H}$. The norm, strong and weak topologies are equivalent in finite dimension.
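14. Among other topologies on $\mathcal{B}(\mathbb{H})$ [80], one of some use in the following is the
$\sigma$-weak topology, $\tau_{\sigma w}$; it is finer than the weak topology for it is the smallest one
that makes continuous the following semi-norms,

$$L_{\{\psi_n\},\{\phi_n\}}(X) = \Bigl|\sum_n \langle\phi_n|X|\psi_n\rangle\Bigr|, \tag{5.9}$$

where $\{\psi_n\}, \{\phi_n\} \subset \mathbb{H}$ are such that $\sum_n \|\psi_n\|^2 < \infty$ and $\sum_n \|\phi_n\|^2 < \infty$.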
Most of the previous assertions are standard facts [80,296,353]; however, the
various topologies on $\mathcal{B}(\mathbb{H})$ deserve a closer look.

Remarks 5.1.1


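1. That the uniform topology is finer than the strong topology can be seen as follows.
Given any strong neighborhood $U^s_\varepsilon(X)$, let $\alpha := \max_{1\le j\le n} \|\psi_j\|$; then $U_{\varepsilon/\alpha}(X) \subseteq U^s_\varepsilon(X)$; indeed,

$$Y \in U_{\varepsilon/\alpha}(X) \implies \|(X - Y)|\psi_j\rangle\| \le \|X - Y\|\,\|\psi_j\| < \frac{\varepsilon}{\alpha}\,\|\psi_j\| \le \varepsilon \implies Y \in U^s_\varepsilon(X),$$

whence $U^s_\varepsilon(X)$ is a uniform neighborhood, too. In order to show that $\tau_u$ is in
general strictly finer than $\tau_s$, it is sufficient to exhibit a sequence of operators
in $\mathcal{B}(\mathbb{H})$ which converges strongly, but not uniformly. To this end, suppose $\mathbb{H}$
to be infinite dimensional, choose an ONB $\{\psi_k\}_{k\in\mathbb{N}}$ with associated orthogonal
projectors $P_k$ and construct $Q_N := \sum_{k=1}^N P_k$. Then, (5.2) reads $s\text{-}\lim_N Q_N = \mathbb{1}$;
namely, if $\psi \in \mathbb{H}$ and $c_\psi(k) = \langle\psi_k|\psi\rangle$, then

$$\lim_{N\to\infty} \|(Q_N - \mathbb{1})|\psi\rangle\|^2 = \lim_{N\to\infty} \sum_{n\ge N+1} |c_\psi(n)|^2 = 0.$$

On the other hand, $Q_N \to \mathbb{1}$ cannot hold in the uniform sense, otherwise for any
$\varepsilon > 0$ there would exist $N_0(\varepsilon)$ such that, if $N \ge N_0(\varepsilon)$, then $\|(Q_N - \mathbb{1})|\psi\rangle\| \le \varepsilon$
uniformly in normalized $\psi \in \mathbb{H}$, while $\|(Q_N - \mathbb{1})|\psi\rangle\| = \|\psi\|$ for all $\psi$ in the subspace
orthogonal to that projected out by $Q_N$.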

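2. In like manner, the weak topology cannot have more neighborhoods than the
strong topology. Given $U^w_\varepsilon(X)$ as in (5.8), set $\beta := \max_{1\le j\le n} \|\phi_j\|$; then $U^s_{\varepsilon/\beta}(X) \subseteq U^w_\varepsilon(X)$; indeed, using (5.1),

$$Y \in U^s_{\varepsilon/\beta}(X) \implies |\langle\phi_j|(X - Y)|\psi_j\rangle| \le \|\phi_j\|\,\|(X - Y)|\psi_j\rangle\| \le \frac{\varepsilon}{\beta}\,\|\phi_j\| \le \varepsilon \implies Y \in U^w_\varepsilon(X).$$

In general, $\tau_s$ is strictly finer than $\tau_w$. Let $\mathbb{H}$ be infinite dimensional and, given an
ONB $\{\psi_k\}_{k\in\mathbb{N}}$, consider the operator $X : \mathbb{H} \to \mathbb{H}$ defined as the right shift along
the ONB:

$$X|\psi_k\rangle = |\psi_{k+1}\rangle, \qquad X^\dagger|\psi_k\rangle = \begin{cases} 0 & k = 1 \\ |\psi_{k-1}\rangle & k > 1 \end{cases}.$$

Note that $X^\dagger X|\psi_k\rangle = |\psi_k\rangle$ for all $k \in \mathbb{N}$ so that $X^\dagger X = \mathbb{1}$; $X$ is an isometry with
$XX^\dagger$ projecting onto the subspace orthogonal to $\psi_1$. Furthermore, by expanding
$\mathbb{H} \ni |\psi\rangle = \sum_k c_\psi(k)|\psi_k\rangle$, $c_\psi(k) := \langle\psi_k|\psi\rangle$, it turns out that

$$\|X^n|\psi\rangle\|^2 = \langle\psi|(X^\dagger)^n X^n|\psi\rangle = \|\psi\|^2,$$

whereas

$$w\text{-}\lim_{n\to\infty} X^n = 0.$$

Indeed, given $\phi, \psi \in \mathbb{H}$, $\varepsilon > 0$ and $|\psi_K\rangle = \sum_{i=1}^K c_\psi(i)|\psi_i\rangle$ such that
$\||\psi\rangle - |\psi_K\rangle\| \le \varepsilon$, (5.1) yields

$$|\langle\phi|X^n\psi\rangle| \le |\langle\phi|X^n\psi_K\rangle| + \varepsilon\|\phi\| = \Bigl|\sum_{i=1}^K c_\psi(i)\,c_\phi(i+n)^*\Bigr| + \varepsilon\|\phi\| \le \sqrt{\sum_{i=1}^K |c_\phi(i+n)|^2}\,\sqrt{\sum_{i=1}^K |c_\psi(i)|^2} + \varepsilon\|\phi\|,$$

where the first square-root becomes negligibly small for sufficiently large $n$.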

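3. By adding to a subalgebra $\mathcal{A} \subset \mathcal{B}(\mathbb{H})$ its limit points with respect to a given topology
$\tau$, one obtains its closure $\overline{\mathcal{A}}^{\tau}$. If of two topologies $\tau_{1,2}$ on $\mathcal{A} \subset \mathcal{B}(\mathbb{H})$, $\tau_1$ is
coarser than $\tau_2$ ($\tau_1 \prec \tau_2$), $\tau_1$ has fewer neighborhoods than $\tau_2$ and thus more convergent
sequences; therefore, $\overline{\mathcal{A}}^{\tau_2} \subseteq \overline{\mathcal{A}}^{\tau_1}$. In particular, $\overline{\mathcal{A}}^{\tau_u}$ is a C*-subalgebra
of $\mathcal{B}(\mathbb{H})$; further, since $\tau_u \succ \tau_s \succ \tau_w$ it follows that $\overline{\mathcal{A}}^{\tau_u} \subseteq \overline{\mathcal{A}}^{\tau_s} \subseteq \overline{\mathcal{A}}^{\tau_w}$.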

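4. Given a $*$-subalgebra $\mathcal{A} \subset \mathcal{B}(\mathbb{H})$, consider the linear functionals $F : \overline{\mathcal{A}}^{\tau_1} \to \mathbb{C}$,
respectively $F : \overline{\mathcal{A}}^{\tau_2} \to \mathbb{C}$, that are continuous with respect to two topologies
$\tau_1 \prec \tau_2$; more precisely, the preimages $F^{-1}(V)$ of open sets $V \subset \mathbb{C}$ are open sets
in $\overline{\mathcal{A}}^{\tau_1}$, respectively $\overline{\mathcal{A}}^{\tau_2}$. Then, since not all open sets in $\overline{\mathcal{A}}^{\tau_2}$ are open sets in
$\overline{\mathcal{A}}^{\tau_1}$, a $\tau_2$-continuous $F$, that is continuous with respect to the finer topology, may
fail to be $\tau_1$-continuous, that is continuous with respect to the coarser topology.
For instance, all weakly continuous linear functionals on $\mathcal{A} \subset \mathcal{B}(\mathbb{H})$ are strongly
continuous, but strong continuity does not in general ensure that a functional is
also weakly continuous.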

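5. If $\mathcal{A}$ is a generic Banach algebra, its topological dual $\mathcal{A}^*$ is the linear space
consisting of all linear functionals $F : \mathcal{A} \to \mathbb{C}$ that are continuous on $\mathcal{A}$. Then,
$\mathcal{A}^*$ can be equipped with the so-called $w^*$-topology, namely with the coarsest
topology that makes continuous all semi-norms of the form

$$L_X(F) = |F(X)| \qquad \forall\, X \in \mathcal{A}. \tag{5.10}$$

Its neighborhoods are of the form

$$U^{w^*}_\varepsilon(F) := \{G \in \mathcal{A}^* : L_{X_j}(G - F) \le \varepsilon,\ 1 \le j \le n\}, \tag{5.11}$$

for any $X_j \in \mathcal{A}$, $n \in \mathbb{N}$ and $\varepsilon > 0$. A sequence $F_n \in \mathcal{A}^*$ $w^*$-converges to $F \in \mathcal{A}^*$,
$w^*\text{-}\lim_{n\to\infty} F_n = F$, if $\lim_{n\to\infty} |F_n(X) - F(X)| = 0$ for all $X \in \mathcal{A}$.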
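The behaviour of the right shift in Remark 5.1.1.2 can be checked numerically. The following is only a sketch under stated assumptions: vectors are truncated to finitely many components, the sequence $c(k) = 1/k$ is an arbitrary square-summable choice, and the helper names `shift`, `inner` and `norm` are illustrative. It shows that $\|X^n\psi\|$ stays constant while the matrix elements $|\langle\phi|X^n\psi\rangle|$ decrease with $n$.

```python
import math

def shift(vec, n):
    """Apply X^n to a finite-support vector: every component moves n places up."""
    return [0.0] * n + list(vec)

def inner(phi, psi):
    """<phi|psi>; components beyond the shorter list are treated as zero."""
    m = min(len(phi), len(psi))
    return sum(complex(phi[k]).conjugate() * psi[k] for k in range(m))

def norm(vec):
    return math.sqrt(abs(inner(vec, vec)))

# Truncation (assumption) of the square-summable sequence c(k) = 1/k
K = 2000
psi = [1.0 / (k + 1) for k in range(K)]
phi = [1.0 / (k + 1) for k in range(K)]

steps = (0, 10, 100, 1000)
norms = [norm(shift(psi, n)) for n in steps]                # constant in n
overlaps = [abs(inner(phi, shift(psi, n))) for n in steps]  # decreasing in n

for n, nv, ov in zip(steps, norms, overlaps):
    print(n, nv, ov)
```

The shifted vectors never shrink in norm, so $X^n$ cannot converge strongly to $0$, whereas every fixed matrix element is eventually small, in agreement with weak convergence.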
5.2 C* Algebras


The bounded operators on a Hilbert space $\mathbb{H}$ form a Banach $*$-algebra with respect
to the uniform norm (5.3); this norm fulfils the two equalities (5.6). While the first
one follows at once from (5.3) and the definition of adjoint, the second one is proved
by using (5.1):

$$\|X\|^2 = \sup_{\|\psi\|=1} \langle\psi|X^\dagger X\psi\rangle \le \|X^\dagger X\| \le \|X^\dagger\|\,\|X\| \implies \|X\| \le \|X^\dagger\|;$$

thus, exchanging $X$ and $X^\dagger$ yields $\|X\|^2 = \|X^\dagger\|^2 = \|X^\dagger X\|$. From the fact that
$\|XY\| \le \|X\|\,\|Y\|$ (an inequality that follows at once from (5.3)), one gets $\|X^\dagger\| = \|X\|$; in fact,

$$\|X\|^2 = \|X^\dagger X\| \le \|X^\dagger\|\,\|X\| \implies \|X\| \le \|X^\dagger\|,$$

while $\|X^\dagger\|^2 = \|XX^\dagger\| \le \|X\|\,\|X^\dagger\| \implies \|X^\dagger\| \le \|X\|$.
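More in general, let $\mathcal{A}$ be an algebra with an involution $\dagger : \mathcal{A} \to \mathcal{A}$ such that

$$(\alpha A + \beta B)^\dagger = \alpha^* A^\dagger + \beta^* B^\dagger, \qquad (AB)^\dagger = B^\dagger A^\dagger$$

for all $\alpha, \beta \in \mathbb{C}$ and $A, B \in \mathcal{A}$. Let $\mathcal{A}$ be complete with respect to a norm $\|\cdot\| : \mathcal{A} \to \mathbb{R}^+$ such that

$$\|\alpha A\| = |\alpha|\,\|A\|, \qquad \|A + B\| \le \|A\| + \|B\|, \qquad \|AB\| \le \|A\|\,\|B\|$$

and $\|A\| = 0 \implies A = 0$ for all $\alpha \in \mathbb{C}$ and $A, B \in \mathcal{A}$. If the norm further satisfies
(5.6) it is called a C* norm.


Definition 5.2.1 Any Banach $*$-algebra $\mathcal{A}$ with respect to a C* norm is called a C*
algebra. $\mathcal{A}$ is called unital if it possesses an identity $\mathbb{1}$ such that $A\,\mathbb{1} = \mathbb{1}\,A = A$ for
all $A \in \mathcal{A}$.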


Examples 5.2.1 


1. The commutative algebras $C(\mathcal{X})$, respectively $L^\infty_\mu(\mathcal{X})$, of continuous, respectively
essentially bounded functions over a compact phase-space $\mathcal{X}$ discussed in
Sect. 2.2.1 are C* algebras with respect to the uniform, respectively essentially
bounded norms.

2. Many instances of quantum systems are $N$-level systems; their Hilbert space is
finite dimensional and thus can be taken as $\mathbb{H} = \mathbb{C}^N$, while their observables are
Hermitean $N \times N$ matrices with complex entries. In such cases, the C* algebra
of bounded operators $\mathcal{B}(\mathbb{H})$ is the full matrix algebra $M_N(\mathbb{C})$. Given an ONB
$\{\psi_j\}_{j=1}^N$ in $\mathbb{C}^N$, set $E_{ij} := |\psi_i\rangle\langle\psi_j|$, $i, j = 1, \ldots, N$; then,

$$E_{ij}|\psi\rangle = \langle\psi_j|\psi\rangle\,|\psi_i\rangle, \qquad E_{ij}E_{k\ell} = \delta_{jk}\,E_{i\ell}, \qquad \sum_{i=1}^N E_{ii} = \mathbb{1}. \tag{5.12}$$

Any set of matrices with these properties constitutes a set of matrix units, the $E_{ii}$
being a complete set of orthogonal projections. Any $X \in M_N(\mathbb{C})$ can thus be
expressed as a linear combination of the matrix units:

$$X = \sum_{i,j=1}^N X_{ij}\,E_{ij}, \qquad X_{ij} := \langle\psi_i|X\psi_j\rangle.$$

In the standard representation where the basis vectors have the form

$$|\psi_j\rangle = (0\ 0\ \cdots\ 1\ \cdots\ 0\ 0)^T,$$

with 1 in the $j$-th entry, the matrix units $E_{ij}$ are $N \times N$ matrices whose entries
are all 0, but for the $ij$-th one which is equal to 1.

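3. A typical scenario often encountered in quantum physics is as follows: a quantum
system described by a generic (not necessarily finite-dimensional) Hilbert space
$\mathbb{H}$ is coupled to an $N$-level system, the corresponding algebra being the tensor
product $M_N(\mathcal{B}(\mathbb{H})) := M_N(\mathbb{C}) \otimes \mathcal{B}(\mathbb{H})$ consisting of operators of the form

$$X = \sum_{i,j=1}^N E_{ij} \otimes X_{ij} = \begin{pmatrix} X_{11} & \cdots & X_{1N} \\ \vdots & \ddots & \vdots \\ X_{N1} & \cdots & X_{NN} \end{pmatrix},$$

where $X_{ij}$ are operators in $\mathcal{B}(\mathbb{H})$ and $E_{ij}$ are matrix units in standard form. The
tensor product $M_N(\mathcal{B}(\mathbb{H}))$ is a $*$-algebra of operators acting on the Hilbert space
$\tilde{\mathbb{H}} := \mathbb{C}^N \otimes \mathbb{H}$ consisting of vectors

$$|\tilde\psi\rangle = \sum_{i=1}^N |i\rangle \otimes |\psi_i\rangle = \begin{pmatrix} \psi_1 \\ \vdots \\ \psi_N \end{pmatrix}. \tag{5.14}$$

A uniform norm on $M_N(\mathcal{B}(\mathbb{H}))$ is defined by

$$\|X\|^2 = \sup_{\|\tilde\psi\|=1} \langle\tilde\psi|X^\dagger X|\tilde\psi\rangle = \sup\Bigl\{ \sum_{i,j,k=1}^N \langle\psi_i|X_{ki}^\dagger X_{kj}|\psi_j\rangle \,:\, \sum_{i=1}^N \|\psi_i\|^2 = 1 \Bigr\}. \tag{5.15}$$

Indeed, it turns out that it satisfies (5.6); besides, $M_N(\mathcal{B}(\mathbb{H}))$ is a complete $*$-algebra
with respect to it, thence a C* algebra.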

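4. Given $X, Y \in \mathcal{B}(\mathbb{H})$, $\mathcal{B}(\mathbb{H}) \ni [X, Y] := XY - YX$ denotes their commutator.
Let $V \subset \mathcal{B}(\mathbb{H})$ be a linear self-adjoint subset, that is it contains the adjoint of
any of its elements. The commutant of $V$, denoted by $V'$, consists of all bounded
operators that commute with all $X \in V$. If $X' \in V'$ also $(X')^\dagger \in V'$; further, if
$X', Y' \in V'$, then

$$[X'Y', X] = X'[Y', X] + [X', X]\,Y' = 0 \qquad \forall\, X \in V.$$

Therefore, $X'Y' \in V'$ and $V'$ is a $*$-algebra; also, if a sequence of $X'_n \in V'$ uniformly
converges to $X' \in \mathcal{B}(\mathbb{H})$, then

$$\|[X', X]\| = \|[X' - X'_n, X]\| \le 2\,\|X' - X'_n\|\,\|X\|$$

implies that $V'$ is uniformly closed and thus a C* subalgebra of $\mathcal{B}(\mathbb{H})$.

5. If $\mathcal{A} \subset \mathcal{B}(\mathbb{H})$ is a C* subalgebra with commutant $\mathcal{A}'$, the center of $\mathcal{A}$, $\mathcal{Z}_{\mathcal{A}} := \mathcal{A} \cap \mathcal{A}'$,
consists of all those $A \in \mathcal{A}$ that commute with every element of $\mathcal{A}$; its elements
thus commute among themselves and it is an Abelian C* subalgebra of $\mathcal{B}(\mathbb{H})$.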
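The matrix-unit relations (5.12) of Example 5.2.1.2 are easy to verify by direct computation. The sketch below assumes the standard representation with $N = 3$ and 0-based indices; the helper names `unit`, `matmul` and `lincomb` and the test matrix `X` are illustrative choices, not part of the text.

```python
def unit(i, j, n):
    """Matrix unit E_ij: the n x n matrix with a single 1 in row i, column j."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def lincomb(terms, n):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    out = [[0] * n for _ in range(n)]
    for coeff, m in terms:
        for r in range(n):
            for c in range(n):
                out[r][c] += coeff * m[r][c]
    return out

N = 3
idx = range(N)
E = {(i, j): unit(i, j, N) for i in idx for j in idx}
zero = [[0] * N for _ in idx]
identity = [[1 if r == c else 0 for c in idx] for r in idx]

# E_ij E_kl = delta_jk E_il
products_ok = all(
    matmul(E[i, j], E[k, l]) == (E[i, l] if j == k else zero)
    for i in idx for j in idx for k in idx for l in idx
)
# sum_i E_ii = identity
resolution_ok = lincomb([(1, E[i, i]) for i in idx], N) == identity
# X = sum_ij X_ij E_ij for an arbitrary test matrix
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
expansion_ok = lincomb([(X[i][j], E[i, j]) for i in idx for j in idx], N) == X

print(products_ok, resolution_ok, expansion_ok)
```

Integer entries are used so that all comparisons are exact.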


Given a bounded operator $A$ in a unital C* algebra $\mathcal{A}$, $a - A$ is invertible in $\mathcal{A}$ ($a$
stands for $a\,\mathbb{1}$) if there exists $B := (a - A)^{-1} \in \mathcal{A}$ for which

$$(a - A)B = B(a - A) = \mathbb{1}.$$

The set of such $a \in \mathbb{C}$ is called the resolvent set of $A$ ($\mathrm{Res}(A)$). Its complement is
the spectrum of $A$ ($\mathrm{Sp}(A)$).


Examples 5.2.2 ([80]) 
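1. Let $A \in \mathcal{A}$ and $a \in \mathbb{C}$ such that $\|A\| < |a|$, then by Taylor expansion

$$\frac{1}{a - A} = \frac{1}{a} \sum_{n=0}^\infty \Bigl(\frac{A}{a}\Bigr)^n;$$

the series converges in norm and gives rise to a well-defined operator in $\mathcal{A}$. Therefore,
$\mathrm{Sp}(A)$ is contained in the subset of $a \in \mathbb{C}$ such that $|a| \le \|A\|$. Furthermore,
if $a_0 \in \mathrm{Res}(A)$ so that $(a_0 - A)^{-1}$ exists in $\mathcal{A}$, choose $a \in \mathbb{C}$ such that
$|a - a_0| < \|(a_0 - A)^{-1}\|^{-1}$; then,

$$\frac{1}{a - A} = \frac{1}{(a - a_0) + a_0 - A} = \frac{1}{a_0 - A} \sum_{n=0}^\infty \Bigl(\frac{a_0 - a}{a_0 - A}\Bigr)^n$$

exists as well, whence $\mathrm{Res}(A)$ is an open subset of $\mathbb{C}$ and $\mathrm{Sp}(A)$ a closed subset
of $\mathbb{C}$.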
2. For $a, b \in \mathbb{C}$ and $A \in \mathcal{A}$, $a - (b - A)$ is invertible if and only if $(b - a) - A$ is
invertible. Thus $\mathrm{Sp}(b - A) = b - \mathrm{Sp}(A)$.

3. For $a \in \mathbb{C}$ and $A \in \mathcal{A}$, $a - A$ is invertible if and only if $a^* - A^\dagger$ is invertible,
whence $\mathrm{Sp}(A^\dagger) = \mathrm{Sp}(A)^*$, the conjugate set of $\mathrm{Sp}(A)$.

4. If $A$ is invertible, using $A^{-1}$ one writes

$$a - A = a\,A\,(A^{-1} - a^{-1}), \qquad a^{-1} - A^{-1} = a^{-1}A^{-1}(A - a).$$

Therefore, if $a - A$ is invertible, then $a^{-1} - A^{-1}$ turns out to be invertible too
and vice versa; therefore, $\mathrm{Sp}(A^{-1}) = (\mathrm{Sp}(A))^{-1}$, the set consisting of the inverse
of each element of $\mathrm{Sp}(A)$ (notice that $0 \notin \mathrm{Sp}(A)$ and $a \in \mathrm{Sp}(A) \implies |a|^{-1} \le \|A^{-1}\| < +\infty$).


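5. The spectral radius $R(A)$ of $A \in \mathcal{A}$ is [80]

$$R(A) := \sup\{|\lambda| : \lambda \in \mathrm{Sp}(A)\} = \lim_{n\to\infty} \|A^n\|^{1/n}.$$

An operator $A \in \mathcal{A}$ is normal if $A^\dagger A = A A^\dagger$; then, $R(A) = \|A\|$. Indeed, the
C* properties of the norm yield

$$\|A^{2^n}\|^2 = \|(A^\dagger)^{2^n} A^{2^n}\| = \|(A^\dagger A)^{2^n}\| = \|(A^\dagger A)^{2^{n-1}} (A^\dagger A)^{2^{n-1}}\| = \|(A^\dagger A)^{2^{n-1}}\|^2 = \cdots = \|A^\dagger A\|^{2^n} = \|A\|^{2^{n+1}},$$

whence

$$R(A) = \lim_{n\to+\infty} \|A^{2^n}\|^{2^{-n}} = \|A\|.$$

6. Self-adjoint $\mathcal{A} \ni A = A^\dagger$ are normal and hence $R(A) = \|A\|$.

7. If $U$ is unitary ($U^\dagger U = UU^\dagger = \mathbb{1}$) or isometric ($U^\dagger U = \mathbb{1}$), then

$$\|U^n\|^2 = \|(U^\dagger)^n U^n\| = \|(U^\dagger)^{n-1} U^{n-1}\| = \cdots = \|U^\dagger U\| = \|\mathbb{1}\| = 1.$$

Therefore, $\mathrm{Sp}(U)$ is contained within the unit disk $\{z \in \mathbb{C} : |z| \le 1\}$. On the
other hand, if $U$ is invertible, $U^{-1} = U^\dagger$ and the preceding points 3 and 4 imply that
$\mathrm{Sp}(U) \subseteq \{z \in \mathbb{C} : |z| = 1\}$.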


8. If $\mathcal{A} \ni A = A^\dagger$ and $|a|^{-1} > \|A\|$, then from point 1 above one deduces that
$-i|a|^{-1} - A = -i|a|^{-1}(\mathbb{1} - i|a|A)$ is invertible so that

$$\mathcal{A} \ni U := (\mathbb{1} + i|a|A)(\mathbb{1} - i|a|A)^{-1}$$

is a well defined unitary operator; moreover, the last point ensures that

$$w - U^\dagger = \frac{2i|a|}{1 + i|a|z}\,(A - z)\,(\mathbb{1} + i|a|A)^{-1}, \qquad w := \frac{1 - i|a|z}{1 + i|a|z},$$

is invertible whenever $|w| \neq 1$, namely whenever $\Im(z) \neq 0$. Therefore, $A - z$ is
invertible and $\mathrm{Sp}(A) \subseteq [-\|A\|, \|A\|]$ because of points 3 and 6.
9. Let $P(z)$ be a polynomial of degree $n$ on $\mathbb{C}$, $A \in \mathcal{A}$ and for $a \in \mathbb{C}$ write

$$P(z) - a = \alpha \prod_{i=1}^n (z - a_i), \quad \alpha, a_i \in \mathbb{C}, \qquad P(A) - a = \alpha \prod_{i=1}^n (A - a_i).$$

The operators $A - a_i$ commute, thus $P(A) - a$ is not invertible if and only if at
least one of them is not invertible, that is $a \in \mathrm{Sp}(P(A))$ if and only if at least one
$a_i \in \mathrm{Sp}(A)$. Since $P(a_i) = a$, it follows that $\mathrm{Sp}(P(A)) = P(\mathrm{Sp}(A))$, the set of
values attained by $P(z)$ on $\mathrm{Sp}(A)$.

10. Suppose $\mathcal{A} \ni A = A^\dagger$; from the previous result and point 8 it turns out that
$\mathrm{Sp}(A^2) = (\mathrm{Sp}(A))^2 \subseteq [0, \|A\|^2]$.
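The norm-convergent expansion of Example 5.2.2.1 can be tried out on a small matrix. This is a minimal sketch, assuming a $2 \times 2$ example matrix `A`, the value `a = 2 + 1j` and a cut-off of 80 terms, all chosen only for illustration; the Frobenius norm is used as a convenient upper bound on $\|A\|$ to confirm that $\|A\| < |a|$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def neumann_resolvent(A, a, terms=80):
    """Partial sum (1/a) * sum_{n < terms} (A/a)^n approximating (a - A)^(-1)."""
    n = len(A)
    ratio = [[A[r][c] / a for c in range(n)] for r in range(n)]
    power = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]  # (A/a)^0
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = [[total[r][c] + power[r][c] for c in range(n)] for r in range(n)]
        power = matmul(power, ratio)
    return [[total[r][c] / a for c in range(n)] for r in range(n)]

A = [[0.5, 1.0], [0.0, -0.5]]   # illustrative, non-normal matrix
a = 2 + 1j                      # |a| = sqrt(5)
frobenius = sum(abs(x) ** 2 for row in A for x in row) ** 0.5  # upper bound on ||A||

B = neumann_resolvent(A, a)
a_minus_A = [[(a if r == c else 0) - A[r][c] for c in range(2)] for r in range(2)]
left = matmul(a_minus_A, B)
right = matmul(B, a_minus_A)
# Largest deviation of (a - A)B and B(a - A) from the identity
residual = max(abs(m[r][c] - (1 if r == c else 0))
               for m in (left, right) for r in range(2) for c in range(2))
print(frobenius < abs(a), residual)
```

Choosing $a$ closer to the spectrum makes the ratio $\|A\|/|a|$ approach 1, and more terms are then needed for the same accuracy.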


Remark 5.2.1 If A € A is self-adjoint, then by the density of the polynomials in 
the commutative C* algebra of continuous functions f over R, one can extend 
Example 5.2.2.9 to f (Sp(A)) = Sp( f (A)). This is the spectral mapping theorem [80, 
385]. 


5.2.1 Positive Operators 


Particularly important bounded self-adjoint operators are the positive ones, that is
those whose spectrum consists of non-negative values; from a physical point of view,
they represent observables that, when measured, always return non-negative outcomes.


Definition 5.2.2 An operator $A$ of a unital C* algebra $\mathcal{A}$ is positive ($A \ge 0$) if
$A = A^\dagger$ and $\mathrm{Sp}(A) \subseteq \mathbb{R}^+$. Given $A, B \in \mathcal{A}$, one sets $A \ge B$ whenever $A - B \ge 0$.


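Remark 5.2.2 Positive operators $\mathcal{A} \ni A \ge 0$ are characterized by being of the form
$A = B^\dagger B$, for some $B \in \mathcal{A}$, and by having a unique positive square-root $\sqrt{A}$ such
that $A = \sqrt{A}\sqrt{A}$ [80] (see also Example 5.3.4.2).

When $\mathcal{A} = \mathcal{B}(\mathbb{H})$, the positivity of a self-adjoint operator $X \in \mathcal{B}(\mathbb{H})$ amounts to
$\langle\psi|X\psi\rangle \ge 0$ for all $\psi \in \mathbb{H}$, which corresponds to the positivity of all its eigenvalues
$x_i \in \mathbb{R}$ such that $(X - x_i)|\psi_i\rangle = 0$ for some $|\psi_i\rangle \in \mathbb{H}$. Denote by $|X| := \sqrt{X^\dagger X}$ the
unique square-root of the positive operator $X^\dagger X$. The map $V : \mathrm{Ran}(|X|) \to \mathrm{Ran}(X)$
defined by $V|X||\psi\rangle = X|\psi\rangle$, where $\mathrm{Ran}(X)$ denotes the range of $X \in \mathcal{B}(\mathbb{H})$,¹ is
a partial isometry,

$$\|V|X||\psi\rangle\|^2 = \langle\psi|X^\dagger X|\psi\rangle = \||X||\psi\rangle\|^2.$$

¹ The range of $X \in \mathcal{B}(\mathbb{H})$ is the linear subset of vectors of the form $|\psi\rangle = X|\phi\rangle$ for some $\phi \in \mathbb{H}$.
$\mathrm{Ran}(X^\dagger) \perp \mathrm{Ker}(X)$, where $\mathrm{Ker}(X)$ is the kernel of $X$, that is the closed subspace of vectors $\psi \in \mathbb{H}$
such that $X|\psi\rangle = 0$.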
Let $U$ denote the partial isometry which equals the extension of $V$ on the closure
of $\mathrm{Ran}(|X|)$ and 0 on $\mathrm{Ker}(|X|) = (\mathrm{Ran}(|X|))^\perp$; then $X = U|X|$ is the so-called polar
decomposition of $X$ [80]. It is unique; namely, if $X = VB$ with $B \ge 0$ and $V$ is a
partial isometry with $V = 0$ on $\mathrm{Ker}(B)$, then

$$X^\dagger X = B\,V^\dagger V B = B^2 \implies B = |X|$$

by the uniqueness of the square-root; further, $U = V$ for both annihilate $\mathrm{Ran}(|X|)^\perp$.
The projection $p := U^\dagger U$ is called the initial projection of $U$, while $q := UU^\dagger$ is its
final projection.

In the simplest case $\mathcal{B}(\mathbb{H}) = M_N(\mathbb{C})$, $|X| = \sqrt{X^\dagger X}$ can be spectralized,
$|X| = \sum_i x_i\,|\psi_i\rangle\langle\psi_i|$. The eigenvalues $x_i \ge 0$ of $|X|$ are the so-called singular
values of $X$, while its eigenvectors $\psi_i$ form an ONB in $\mathbb{C}^N$. Using the polar decomposition,
it turns out that any matrix can always be represented in terms of its singular
values and of two, generally different, ONBs,

$$X = U|X| = \sum_{j=1}^N x_j\,|\phi_j\rangle\langle\psi_j|, \qquad |\phi_j\rangle = U|\psi_j\rangle. \tag{5.16}$$

Also, if $V$ is the unitary matrix that diagonalizes the Hermitean matrix $|X|$, $|X| = V D V^\dagger$,
then $X = W D V^\dagger$ with $W := UV$ unitary.
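As a minimal worked instance of the polar decomposition (an illustration added here, with the matrix $X = E_{12} \in M_2(\mathbb{C})$ chosen only for simplicity), every ingredient can be written down explicitly:

```latex
X = E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
X^\dagger X = E_{21}E_{12} = E_{22}, \qquad
|X| = \sqrt{X^\dagger X} = E_{22},
\\[4pt]
U = E_{12}: \qquad U|X| = E_{12}E_{22} = E_{12} = X, \qquad
p = U^\dagger U = E_{22}, \qquad q = UU^\dagger = E_{11}.
```

The singular values of $X$ are thus 1 and 0. Here $U$ is only a partial isometry, with initial projection $E_{22}$ and final projection $E_{11}$; on $\mathbb{C}^2$ it can be replaced by the unitary $W = E_{12} + E_{21}$, since $W|X| = (E_{12} + E_{21})E_{22} = E_{12} = X$ as well.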


Examples 5.2.3 ([11,348]) 


1. From Example 5.2.2.10 it turns out that the spectrum of the positive elements
$A \ge 0$ of a C* algebra $\mathcal{A}$ is such that $\mathrm{Sp}(A) \subseteq [0, \|A\|]$.

2. Suppose $A \ge B \ge 0$ for $A, B \in \mathcal{A}$; then, from the previous remark, $A - B = C^\dagger C$, whence

$$D^\dagger(A - B)D = (CD)^\dagger(CD) \ge 0 \implies D^\dagger A D \ge D^\dagger B D,$$

for all $D \in \mathcal{A}$. A typical situation is when $A = \mathbb{1}$ and $B = P$, an orthogonal projection, which
is always $\le \mathbb{1}$; then $D^\dagger P D \le D^\dagger D$.

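3. Let $\mathcal{B}(\mathbb{H}) \ni X := P_\psi - P_\phi = |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|$, $\psi \neq \phi \in \mathbb{H}$. One can always
write

$$|\phi\rangle = \alpha|\psi\rangle + \beta|\psi^\perp\rangle, \qquad \langle\psi|\psi^\perp\rangle = 0, \qquad \alpha := \langle\psi|\phi\rangle, \qquad \beta = \sqrt{1 - |\alpha|^2}.$$

Then, on the subspace $\mathbb{K}$ spanned by $\psi$ and $\psi^\perp$, $X$ is represented by the $2 \times 2$
matrix

$$X = \beta \begin{pmatrix} \beta & -\alpha \\ -\alpha^* & -\beta \end{pmatrix}.$$

Thus $X$ has eigenvalues $\pm\beta$ and eigenprojectors $|\pm\rangle\langle\pm|$, with

$$|\pm\rangle = e^{i\vartheta}\sqrt{\frac{1 \pm \beta}{2}}\;|\psi\rangle \mp \sqrt{\frac{1 \mp \beta}{2}}\;|\psi^\perp\rangle,$$

where $e^{i\vartheta}$ is the phase of $\alpha$. Therefore,

$$X = \beta\,\bigl(|+\rangle\langle+| - |-\rangle\langle-|\bigr), \qquad |X| = \beta\,Q_{\mathbb{K}},$$

where $Q_{\mathbb{K}} = |+\rangle\langle+| + |-\rangle\langle-|$ projects onto $\mathbb{K}$. Further, $U = \beta^{-1}X$ is an
isometry on $\mathbb{K}$ that vanishes on $\mathbb{K}^\perp$.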
4. If $X = U|X|$ then $X^\dagger = |X|\,U^\dagger$ whence

$$XX^\dagger = U|X|^2 U^\dagger = U X^\dagger X\, U^\dagger.$$

Therefore, $X^\dagger X$ and $XX^\dagger$ have the same eigenvalues with the same degeneracies
apart, possibly, from the eigenvalue 0.

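5. Suppose $0 < X \le Y$, $X, Y \in \mathcal{B}(\mathbb{H})$; then $Y^{-1} \le X^{-1}$. Indeed, $X^{-1}$ and $Y^{-1}$
exist; thus one can set $Z := Y^{-1/2} X\, Y^{-1/2}$. Then, $Z \le \mathbb{1}$. In fact (see Definition 5.2.2),
for all $\psi \in \mathbb{H}$

$$\langle\psi|Y^{-1/2} X\, Y^{-1/2}|\psi\rangle = \langle Y^{-1/2}\psi|X|Y^{-1/2}\psi\rangle \le \langle Y^{-1/2}\psi|Y|Y^{-1/2}\psi\rangle = \langle\psi|\psi\rangle.$$

By the same argument, multiplication of both sides of the inequality $Z \le \mathbb{1}$ by $Z^{-1/2}$,
on the left and on the right, yields $\mathbb{1} \le Z^{-1} = Y^{1/2} X^{-1} Y^{1/2}$; one then multiplies both sides
of this inequality by $Y^{-1/2}$.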

6. Let $X \le Y \in \mathcal{B}(\mathbb{H})$ be such that $\log X$ exists (see Remarks 5.2.1 and 5.3.4), then
$\log X \le \log Y$. This follows from the spectral calculus and the previous point, for
$t + X \le t + Y$ for all $t \ge 0$ and the fact that

$$\log\frac{x}{y} = \int_0^{+\infty} dt\,\Bigl(\frac{1}{t + y} - \frac{1}{t + x}\Bigr), \qquad \forall\, x, y > 0.$$

Then,

$$\log X - \log Y = \int_0^{+\infty} dt\,\Bigl(\frac{1}{t + Y} - \frac{1}{t + X}\Bigr) \le 0.$$


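7. Consider the setting of Example 5.2.1.3, that is the C* algebra $M_N(\mathcal{B}(\mathbb{H}))$; $X \ge 0$
if and only if $\langle\tilde\psi|X\tilde\psi\rangle = \sum_{i,j=1}^N \langle\psi_i|X_{ij}|\psi_j\rangle \ge 0$ for all $\tilde\psi \in \tilde{\mathbb{H}}$. By arbitrarily
choosing $Y_i \in \mathcal{B}(\mathbb{H})$, $\phi \in \mathbb{H}$ and setting $|\psi_i\rangle := Y_i|\phi\rangle$, it follows that $X \ge 0$
if and only if

$$\sum_{i,j=1}^N Y_i^\dagger X_{ij} Y_j = \begin{pmatrix} Y_1^\dagger & \cdots & Y_N^\dagger \end{pmatrix} \begin{pmatrix} X_{11} & \cdots & X_{1N} \\ \vdots & \ddots & \vdots \\ X_{N1} & \cdots & X_{NN} \end{pmatrix} \begin{pmatrix} Y_1 \\ \vdots \\ Y_N \end{pmatrix} \ge 0$$

for all $Y_i \in \mathcal{B}(\mathbb{H})$.

8. Also, $X \ge 0$ if and only if $X = Y^\dagger Y$, $Y \in M_N(\mathcal{B}(\mathbb{H}))$. Then,

$$X = \sum_{i,j,k,\ell=1}^N E_{ji}E_{k\ell} \otimes Y_{ij}^\dagger Y_{k\ell} = \sum_{k=1}^N \Bigl( \sum_{j,\ell=1}^N E_{j\ell} \otimes Y_{kj}^\dagger Y_{k\ell} \Bigr).$$

Therefore, $X \ge 0$ if and only if it is a sum of matrices of the
form $[Y_i^\dagger Y_j]$, $Y_i \in \mathcal{B}(\mathbb{H})$.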

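9. Consider the $N \times N$ matrix $E := [E_{ij}] \in M_N(M_N(\mathbb{C}))$ whose entries are matrix
units $E_{ij} \in M_N(\mathbb{C})$. According to the previous point, $E \ge 0$; indeed,

$$E = \sum_{i,j=1}^N E_{ij} \otimes E_{ij} = \sum_{i,j=1}^N E_{ij} \otimes E_{ki}^\dagger E_{kj} \qquad \forall\, k = 1, 2, \ldots, N.$$

Let $\{|i\rangle\}_{i=1}^N$ be the ONB such that $E_{ij} = |i\rangle\langle j|$; then, $E$ turns out to be proportional
to the orthogonal projector $P^N_+$,

$$P^N_+ = |\hat\Psi^N_+\rangle\langle\hat\Psi^N_+| = \frac{1}{N} \sum_{i,j=1}^N |i\rangle\langle j| \otimes |i\rangle\langle j| = \frac{1}{N}\,E, \tag{5.17}$$

onto the totally symmetric state

$$\mathbb{C}^N \otimes \mathbb{C}^N \ni |\hat\Psi^N_+\rangle := \frac{1}{\sqrt{N}} \sum_{j=1}^N |jj\rangle. \tag{5.18}$$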
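The statement of Example 5.2.3.9 that $E = \sum_{i,j} E_{ij} \otimes E_{ij}$ is $N$ times a rank-one projector can be confirmed by direct computation. The sketch below assumes $N = 3$, 0-based indices and the standard Kronecker-product ordering of the basis $|i\rangle \otimes |j\rangle$; the helper names `unit`, `matmul` and `kron` are illustrative.

```python
def unit(i, j, n):
    """Matrix unit E_ij in the standard representation."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[r // m][c // m] * b[r % m][c % m] for c in range(n * m)]
            for r in range(n * m)]

N = 3
dim = N * N

# E = sum_ij E_ij (x) E_ij
E = [[0] * dim for _ in range(dim)]
for i in range(N):
    for j in range(N):
        term = kron(unit(i, j, N), unit(i, j, N))
        E = [[E[r][c] + term[r][c] for c in range(dim)] for r in range(dim)]

# Unnormalized symmetric vector sum_i |ii>: component i*N + i equals 1
v = [1 if r % (N + 1) == 0 else 0 for r in range(dim)]

is_n_times_projector = matmul(E, E) == [[N * x for x in row] for row in E]
is_rank_one = E == [[v[r] * v[c] for c in range(dim)] for r in range(dim)]
Ev = [sum(E[r][c] * v[c] for c in range(dim)) for r in range(dim)]
print(is_n_times_projector, is_rank_one, Ev == [N * x for x in v])
```

Since $E^2 = N E$ and $E = |v\rangle\langle v|$ with $\langle v|v\rangle = N$, the matrix $E/N$ is the projector onto the normalized vector $v/\sqrt{N}$, in agreement with (5.17) and (5.18).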


5.2.1.1 Finite Dimensional Algebras 
A C* algebra $\mathcal{A}$ is finite dimensional if its dimension as a linear space is finite;
as such it has an identity $\mathbb{1}$. In particular, its center $\mathcal{Z}_{\mathcal{A}}$ (see Example 5.2.1.5) is a
finite dimensional algebra whose elements all commute: such an algebra is called
Abelian. It is generated by minimal projections $\{P_i\}_{i=1}^n$ (see Example 5.3.4). Due to
their orthogonality, $\mathcal{A} = \bigoplus_{i=1}^n \mathcal{A}_i$, where $\mathcal{A}_i := \mathcal{A} P_i$, with $P_i$ its identity operator,
whence their centers are trivial and the $\mathcal{A}_i$ simple algebras. In fact, $\mathcal{A}_i$ cannot contain
any non trivial ideal $\mathfrak{i} \subset \mathcal{A}_i$, for, like $\mathcal{A}$, also $\mathfrak{i}$ has an identity $E$ [348]; then,
$XE \in \mathfrak{i} \implies EXE = XE$ so that $E$ commutes with all self-adjoint $X \in \mathcal{A}_i$ and thus belongs
to its center: $EX = (XE)^\dagger = EXE = XE$.

We shall set $\mathcal{A} = \mathcal{A}_i$ and show that it is isomorphic to a matrix algebra by constructing
an appropriate system of matrix units $\{E_{ij}\}_{i,j=1}^d$.

Let $\mathcal{B} \subset \mathcal{A}$ be a maximally Abelian subalgebra with minimal projectors $\{Q_j\}_{j=1}^d$
such that $Q_j Q_k = \delta_{jk} Q_j$ and $\sum_{j=1}^d Q_j = \mathbb{1}$. For each $Q_j$, one can always choose
$X_j \in \mathcal{A}$ such that $Y_j := Q_j X_j Q_1 \neq 0$; indeed, for all $X \in \mathcal{A}$, $\mathfrak{i}_X := \{\sum_i X_i X Y_i : X_i, Y_i \in \mathcal{A}\}$
is an ideal of $\mathcal{A}$ which, as $\mathcal{A}$ is simple, must coincide with it.
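Observe that $Y_j^\dagger Y_j = Q_1 X_j^\dagger Q_j X_j Q_1$ and $Y_j Y_j^\dagger = Q_j X_j Q_1 X_j^\dagger Q_j$ commute
with $\mathcal{B}$; since $\mathcal{B}$ is maximally Abelian, they belong to it. Thus, $Y_j^\dagger Y_j = \lambda_j Q_1$
and $Y_j Y_j^\dagger = \mu_j Q_j$. Further, $\lambda_j = \mu_j > 0$, for $\|Y_j^\dagger Y_j\| = \|Y_j Y_j^\dagger\|$. By setting
$Z_j := Y_j/\sqrt{\lambda_j}$ and $E_{ij} := Z_i Z_j^\dagger$, it follows that $Z_j^\dagger Z_j = Q_1$, while $E_{jj} = Z_j Z_j^\dagger = Q_j$ for
all $j$, and

$$E_{ij}E_{pq} = \frac{Q_i X_i Q_1 X_j^\dagger Q_j\, Q_p X_p Q_1 X_q^\dagger Q_q}{\sqrt{\lambda_i \lambda_j \lambda_p \lambda_q}} = \delta_{jp}\, \frac{Q_i X_i Q_1\, \overbrace{Q_1 X_j^\dagger Q_j X_j Q_1}^{Y_j^\dagger Y_j}\, Q_1 X_q^\dagger Q_q}{\lambda_j \sqrt{\lambda_i \lambda_q}} = \delta_{jp}\, E_{iq}.$$

Thus, the $E_{ij}$ are the required set of matrix units as they linearly span $\mathcal{A}$: for all
$X \in \mathcal{A}$, $Z_i^\dagger X Z_j = (Q_1 X_i^\dagger Q_i\, X\, Q_j X_j Q_1)/\sqrt{\lambda_i \lambda_j} = \mu_{ij}(X)\, Q_1$, whence

$$X = \sum_{i,j=1}^d Q_i X Q_j = \sum_{i,j=1}^d Z_i Z_i^\dagger\, X\, Z_j Z_j^\dagger = \sum_{i,j=1}^d \mu_{ij}(X)\, Z_i Q_1 Z_j^\dagger = \sum_{i,j=1}^d \mu_{ij}(X)\, E_{ij}.$$

Therefore, any finite dimensional C* algebra is isomorphic to the orthogonal sum of
full matrix algebras: $\mathcal{A} \simeq \bigoplus_{i=1}^n M_{d_i}(\mathbb{C})$.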


5.2.1.2 Compact Operators 
If $\mathbb{H}$ is infinite dimensional, the matricial structure of $M_N(\mathbb{C})$ carries over to the
so-called compact operators, $\mathcal{B}_\infty(\mathbb{H})$. These are all $X \in \mathcal{B}(\mathbb{H})$ such that $|X|$ has a
discrete spectrum of finitely degenerate eigenvalues that accumulate to 0, the only
eigenvalue with possibly infinite degeneracy.² It turns out that these spectral properties
are preserved by linear combinations and operator multiplication [296,317].
Practically speaking, compact operators are obtained by closing with respect to
the uniform norm the $*$-algebra of finite rank operators, that is the linear span of all
possible $X$ on $\mathbb{H}$ that are non-zero on finite dimensional subspaces only, where they
can be represented as usual matrices. As such the algebra of compact operators is a
Banach $*$-algebra without identity operator, for the only eigenvalue of $\mathbb{1}$ is infinitely
degenerate.

² The simplest example of compact operator is any projector $P = |\psi\rangle\langle\psi|$ which vanishes on the
orthogonal complement of $\psi$, whence its zero eigenvalue is infinitely degenerate when $\mathbb{H}$ is infinite
dimensional.
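5.2.1.3 Trace-Class Operators
Consider a matrix algebra $M_N(\mathbb{C})$; the functional $\mathrm{Tr} : M_N(\mathbb{C}) \to \mathbb{C}$,

$$M_N(\mathbb{C}) \ni X \mapsto \mathrm{Tr}(X) := \sum_{i=1}^N \langle\psi_i|X|\psi_i\rangle, \tag{5.19}$$

where $\{\psi_j\}_{j=1}^N$ is any ONB in $\mathbb{C}^N$, defines a so-called trace on $M_N(\mathbb{C})$.
The trace is basis-independent; indeed, because of (5.2), given any two ONBs
$\{\phi_j\}_{j=1}^N$ and $\{\psi_k\}_{k=1}^N$,

$$\sum_{j=1}^N \langle\phi_j|X|\phi_j\rangle = \sum_{j,k,\ell=1}^N \langle\phi_j|\psi_k\rangle\langle\psi_k|X|\psi_\ell\rangle\langle\psi_\ell|\phi_j\rangle = \sum_{k,\ell=1}^N \underbrace{\Bigl(\sum_{j=1}^N \langle\psi_\ell|\phi_j\rangle\langle\phi_j|\psi_k\rangle\Bigr)}_{\delta_{k\ell}} \langle\psi_k|X|\psi_\ell\rangle = \sum_{k=1}^N \langle\psi_k|X|\psi_k\rangle.$$

The trace of a matrix amounts to the sum of its diagonal entries, so $\mathrm{Tr}(X) \ge 0$ if
$X \ge 0$. Further, it is cyclic; namely, for all $X, Y \in M_N(\mathbb{C})$, (5.2) yields

$$\mathrm{Tr}(XY) = \sum_{i=1}^N \langle\psi_i|XY|\psi_i\rangle = \sum_{i,j=1}^N \langle\psi_i|X|\psi_j\rangle\langle\psi_j|Y|\psi_i\rangle = \sum_{i,j=1}^N \langle\psi_j|Y|\psi_i\rangle\langle\psi_i|X|\psi_j\rangle = \mathrm{Tr}(YX). \tag{5.20}$$

Using the trace, one constructs the following map from $M_N(\mathbb{C})$ onto $\mathbb{R}^+$,

$$\|\cdot\|_1 : X \mapsto \|X\|_1 = \mathrm{Tr}|X| = \sum_{i=1}^N x_i, \tag{5.21}$$

where $x_i$ are the singular values of $X$ (see (5.16)). This map vanishes only if $X = 0$;
also, from (5.4) and (5.16) it follows that

$$|\mathrm{Tr}(YX)| \le \sum_{i=1}^N x_i\,|\langle\psi_i|YU|\psi_i\rangle| \le \|Y\|\,\|X\|_1 \tag{5.22}$$

$$|\mathrm{Tr}X| = |\mathrm{Tr}(U|X|)| \le \|X\|_1 \tag{5.23}$$

$$\|X + Z\|_1 = \mathrm{Tr}\bigl(U^\dagger(X + Z)\bigr) \le \|X\|_1 + \|Z\|_1, \tag{5.24}$$

where, in (5.24), $U$ is the partial isometry in the polar decomposition $X + Z = U|X + Z|$.
Therefore, $\|\cdot\|_1$ is a norm on $M_N(\mathbb{C})$ called trace-norm.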
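Remark 5.2.3 Let $X^\dagger = X \in \mathcal{B}(\mathbb{H})$; then, using its spectral decomposition (for simplicity,
we shall consider a discrete spectrum), it can be decomposed into its positive
orthogonal components

$$X = \sum_{i\in I_+} x_i\,|x_i\rangle\langle x_i| - \sum_{i\in I_-} x_i\,|x_i\rangle\langle x_i| =: X_+ - X_-,$$

where $x_i$ with $i \in I_+$ are positive eigenvalues and $-x_i$ with $i \in I_-$ are negative ones
with eigen-projectors $|x_i\rangle\langle x_i|$. Then,

$$|X| = \sqrt{X^\dagger X} = X_+ + X_-, \qquad \|X\|_1 = \mathrm{Tr}X_+ + \mathrm{Tr}X_- \ge \mathrm{Tr}X_+ - \mathrm{Tr}X_- = \mathrm{Tr}(X). \tag{5.25}$$

If $X \neq X^\dagger$ then it can be first decomposed into Hermitean operators,

$$X = X_1 + iX_2, \qquad X_1 := \frac{X + X^\dagger}{2}, \qquad X_2 := \frac{X - X^\dagger}{2i},$$

which can in turn be decomposed into their positive components

$$X_{1,2} = X^+_{1,2} - X^-_{1,2}, \qquad X^\pm_{1,2} := \frac{|X_{1,2}| \pm X_{1,2}}{2}.$$

Therefore, any $X \in \mathcal{B}(\mathbb{H})$ can be written as a complex combination of four
positive operators $X^\pm_{1,2}$:

$$X = X_1^+ - X_1^- + iX_2^+ - iX_2^-. \tag{5.26}$$

If extended to $\mathcal{B}(\mathbb{H})$ with $\mathbb{H}$ infinite dimensional, the trace selects the linear subspace
$\mathcal{B}_1(\mathbb{H}) \subset \mathcal{B}(\mathbb{H})$ of trace-class operators:

$$\mathcal{B}_1(\mathbb{H}) := \bigl\{X \in \mathcal{B}(\mathbb{H}) : \|X\|_1 < \infty\bigr\}.$$

If $X \in \mathcal{B}_1(\mathbb{H})$ then, by the polar decomposition, $X = \sum_n x_n\,|\phi_n\rangle\langle\psi_n|$, where $\{\phi_n\}$
and $\{\psi_n\}$ are two ONBs in $\mathbb{H}$, $x_n$ are the eigenvalues of $|X| = \sqrt{X^\dagger X}$ and the sum
converges in trace-norm; also, $\|X\|_1 = \sum_j x_j$.

Then, inequality (5.22) holds with $Y \in \mathcal{B}(\mathbb{H})$, $X \in \mathcal{B}_1(\mathbb{H})$, (5.23) and (5.24) with
$X, Z \in \mathcal{B}_1(\mathbb{H})$. The trace-class operators thus form a $*$-algebra³; $\mathcal{B}_1(\mathbb{H})$ is also
closed with respect to the trace-norm and thus a Banach $*$-algebra, without identity
in infinite dimension for $\mathrm{Tr}(\mathbb{1})$ diverges [296,317].

³ $\mathcal{B}_1(\mathbb{H})$ is a two-sided ideal, namely $YX, XY \in \mathcal{B}_1(\mathbb{H})$ whenever $X \in \mathcal{B}_1(\mathbb{H})$ and $Y \in \mathcal{B}(\mathbb{H})$.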
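Example 5.2.4 ([80]) Any $X \in \mathcal{B}(\mathbb{H})$ defines a linear functional

$$F_X : \mathcal{B}_1(\mathbb{H}) \to \mathbb{C}, \qquad \rho \mapsto F_X(\rho) := \mathrm{Tr}(X\rho) \quad \forall\, \rho \in \mathcal{B}_1(\mathbb{H}),$$

on $\mathcal{B}_1(\mathbb{H})$ which is bounded for $|F_X(\rho)| \le \|X\|\,\|\rho\|_1$. Therefore, $\mathcal{B}(\mathbb{H})$ can be
identified with a subspace of $\mathcal{B}_1(\mathbb{H})^*$, the topological dual of $\mathcal{B}_1(\mathbb{H})$, that is the
(Banach) space consisting of all continuous linear functionals on $\mathcal{B}_1(\mathbb{H})$. Actually,
$\mathcal{B}(\mathbb{H}) = \mathcal{B}_1(\mathbb{H})^*$. Indeed, let $F \in \mathcal{B}_1(\mathbb{H})^*$ and consider the bounded operator
$|\phi\rangle\langle\psi|$ with $\phi, \psi \in \mathbb{H}$ not necessarily normalized. It is also trace-class; indeed, set
$P_\psi = |\psi\rangle\langle\psi|/\|\psi\|^2$; then,

$$\bigl\||\phi\rangle\langle\psi|\bigr\|_1 = \mathrm{Tr}\Bigl(\sqrt{\|\phi\|^2\,\|\psi\|^2\,P_\psi}\Bigr) = \|\phi\|\,\|\psi\|.$$

It thus follows that $|F(|\phi\rangle\langle\psi|)| \le \|F\|\,\|\phi\|\,\|\psi\|$ for all $\phi, \psi \in \mathbb{H}$. Therefore, each
$F \in \mathcal{B}_1(\mathbb{H})^*$ defines a so-called sesquilinear form on $\mathbb{H} \times \mathbb{H}$, linear in the first argument,
antilinear in the second one and continuous with respect to both. Consequently,
there exists a unique operator $X_F \in \mathcal{B}(\mathbb{H})$ such that $F(|\phi\rangle\langle\psi|) = \langle\psi|X_F|\phi\rangle$ for
all $\phi, \psi \in \mathbb{H}$. Before proving this fact, we draw the conclusion; as already noticed,
any $\rho \in \mathcal{B}_1(\mathbb{H})$ can be written as $\rho = \sum_n r_n\,|\phi_n\rangle\langle\psi_n|$ with the possibly infinite sum
converging in trace-norm; thus, $\mathcal{B}_1(\mathbb{H})^* \subseteq \mathcal{B}(\mathbb{H})$ for

$$F(\rho) = \sum_n r_n\,F(|\phi_n\rangle\langle\psi_n|) = \sum_n r_n\,\langle\psi_n|X_F|\phi_n\rangle = \mathrm{Tr}(X_F\,\rho).$$

The property of sesquilinear forms used above comes as follows: if $f : \mathbb{H} \to \mathbb{C}$ is a
continuous linear functional on $\mathbb{H}$, that is $|f(\psi)| \le \|f\|\,\|\psi\|$, then $\mathrm{Ker}(f)$ is closed.
Assume $\mathrm{Ker}(f) \neq \mathbb{H}$; if $\phi \in \mathrm{Ker}(f)^\perp$, $\|\phi\| = 1$, then $f(\phi) \neq 0$ and

$$\mathrm{Ker}(f) \ni |\chi\rangle = f(\phi)|\psi\rangle - f(\psi)|\phi\rangle \implies f(\psi) = \langle\phi_f|\psi\rangle,$$

where $|\phi_f\rangle := f(\phi)^*|\phi\rangle$. It is easily seen that this vector is unique and that $\|f\| = |f(\phi)|$.
Given a continuous sesquilinear form $f : \mathbb{H} \times \mathbb{H} \to \mathbb{C}$, for each fixed $\psi \in \mathbb{H}$
it defines a continuous linear functional $f_\psi : \mathbb{H} \to \mathbb{C}$; therefore, there exists a
unique $|\chi_\psi\rangle \in \mathbb{H}$ such that $f(\phi, \psi) = f_\psi(\phi) = \langle\chi_\psi|\phi\rangle$. This allows one to define
a linear operator $X_f^\dagger \in \mathcal{B}(\mathbb{H})$ such that $X_f^\dagger|\psi\rangle = |\chi_\psi\rangle$, whence $f(\phi, \psi) = \langle\psi|X_f|\phi\rangle$.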


Remark 5.2.4 As $\mathcal{B}(\mathbb{H})$ is the dual of $\mathcal{B}_1(\mathbb{H})$ it can be equipped with the corresponding
$w^*$-topology (see Remark 5.1.1.5); namely, any $\rho \in \mathcal{B}_1(\mathbb{H})$ defines a linear functional
$\mathcal{B}(\mathbb{H}) \ni X \mapsto F_\rho(X) := \mathrm{Tr}(X\rho)$. The $w^*$-topology on $\mathcal{B}(\mathbb{H})$ is the coarsest
one with respect to which all semi-norms $L_\rho(X) := |\mathrm{Tr}(\rho X)|$ are continuous. By
comparing these semi-norms with those in (5.9), it turns out that the $w^*$-topology
coincides with the $\sigma$-weak topology.
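5.2.1.4 Hilbert-Schmidt Operators
A second norm on $M_N(\mathbb{C})$, also based on the trace, is given by

$$\|\cdot\|_2 : X \mapsto \|X\|_2 := \sqrt{\mathrm{Tr}\,X^\dagger X} = \sqrt{\sum_{i=1}^N x_i^2}. \tag{5.27}$$

It is called Hilbert-Schmidt norm; unlike the trace-norm, it originates from a
(Hilbert-Schmidt) scalar product

$$M_N(\mathbb{C}) \times M_N(\mathbb{C}) \ni (Y, X) \mapsto \mathrm{Tr}(Y^\dagger X) \tag{5.28}$$

that satisfies $|\mathrm{Tr}(Y^\dagger X)| \le \|Y\|_2\,\|X\|_2$. In fact, using (5.16) and (5.1),

$$|\mathrm{Tr}(Y^\dagger X)| \le \sum_{i=1}^N x_i\,|\langle\psi_i|Y^\dagger U|\psi_i\rangle| \le \sqrt{\sum_{i=1}^N x_i^2}\;\sqrt{\sum_{j=1}^N |\langle\psi_j|Y^\dagger U|\psi_j\rangle|^2}$$

$$\le \|X\|_2\,\sqrt{\sum_{j=1}^N \langle\psi_j|Y^\dagger U|\psi_j\rangle\langle\psi_j|U^\dagger Y|\psi_j\rangle} \le \|X\|_2\,\|Y\|_2, \tag{5.29}$$

for $U|\psi_j\rangle\langle\psi_j|U^\dagger \le \mathbb{1}$. When defined on $\mathcal{B}(\mathbb{H})$ with $\mathbb{H}$ infinite-dimensional, the
Hilbert-Schmidt norm singles out the linear subspace $\mathcal{B}_2(\mathbb{H})$ of Hilbert-Schmidt
operators

$$\mathcal{B}_2(\mathbb{H}) := \bigl\{X \in \mathcal{B}(\mathbb{H}) : \|X\|_2 < \infty\bigr\}.$$

If $X \in \mathcal{B}_2(\mathbb{H})$, $\|X\|_2 = \sqrt{\sum_{i=1}^\infty x_i^2}$, with $x_i$ the singular values of $X$. Then, inequality
(5.29) holds with $X, Y \in \mathcal{B}_2(\mathbb{H})$, respectively $Y \in \mathcal{B}(\mathbb{H})$, $X \in \mathcal{B}_2(\mathbb{H})$. $\mathcal{B}_2(\mathbb{H})$ is also
closed with respect to $\|\cdot\|_2$ and thus a Banach $*$-subalgebra of $\mathcal{B}(\mathbb{H})$ (actually also a
two-sided ideal as $\mathcal{B}_1(\mathbb{H})$) without identity in the infinite dimensional case for $\|\mathbb{1}\|_2$
diverges [296,317].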


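Example 5.2.5 Let $F_j$, $j = 1, 2, \ldots, N^2$, be a set of $N \times N$ matrices, orthogonal
with respect to (5.28), $\mathrm{Tr}(F_j^\dagger F_k) = \delta_{jk}$: they form an ONB in $M_N(\mathbb{C})$. Indeed, as
a Hilbert space equipped with (5.28), $M_N(\mathbb{C})$ has dimension $N^2$; therefore, for all
$X \in M_N(\mathbb{C})$,

$$M_N(\mathbb{C}) \ni X = \sum_{j=1}^{N^2} \mathrm{Tr}(X F_j^\dagger)\,F_j. \tag{5.30}$$

Consider the linear map $\mathrm{Tr}_N : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ defined by

$$M_N(\mathbb{C}) \ni X \mapsto \mathrm{Tr}_N[X] := \mathrm{Tr}(X)\,\mathbb{1}_N. \tag{5.31}$$

We shall refer to it as trace map. Choose an ONB $\{|\alpha\rangle\}_{\alpha=1}^N$ in $\mathbb{C}^N$; the $N^2$ matrix
units $E_{\alpha\beta} := |\alpha\rangle\langle\beta|$ also form an ONB in $M_N(\mathbb{C})$. Thus, there must exist a unitary
matrix $U \in M_{N^2}(\mathbb{C})$ such that $E_{\alpha\beta} = \sum_{i=1}^{N^2} U_{\alpha\beta,i}\,F_i$. Therefore, the trace-map can
be recast as

$$\mathrm{Tr}_N[X] = \sum_{\alpha,\beta=1}^N E_{\beta\alpha}\,X\,E_{\alpha\beta} = \sum_{i,j=1}^{N^2} \sum_{\alpha,\beta=1}^N U^*_{\alpha\beta,i}\,U_{\alpha\beta,j}\,F_i^\dagger X F_j = \sum_{i=1}^{N^2} F_i^\dagger X F_i. \tag{5.32}$$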
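Relation (5.32) can be checked for $N = 2$ with two different Hilbert-Schmidt ONBs. The sketch below is only an illustration: the choice of the normalized Pauli matrices $F_j = \sigma_j/\sqrt{2}$ (with $\sigma_0 = \mathbb{1}$) as second basis, the test matrix `X` and the helper names are assumptions made here, not taken from the text.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def sandwich_sum(ops, X):
    """sum_j ops[j]^dagger X ops[j]."""
    n = len(X)
    out = [[0j] * n for _ in range(n)]
    for op in ops:
        term = matmul(matmul(dagger(op), X), op)
        out = [[out[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return out

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) <= tol
               for r in range(len(a)) for c in range(len(a)))

# Matrix units E_ab and normalized Pauli matrices F_j = sigma_j / sqrt(2)
units = [[[1 if (r, c) == (a, b) else 0 for c in range(2)] for r in range(2)]
         for a in range(2) for b in range(2)]
s = 1 / math.sqrt(2)
paulis = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
F = [[[s * x for x in row] for row in p] for p in paulis]

# Hilbert-Schmidt Gram matrix Tr(F_j^dagger F_k), expected to be the 4 x 4 identity
gram = [[trace(matmul(dagger(Fj), Fk)) for Fk in F] for Fj in F]

X = [[1 + 2j, 3], [-1j, 4]]                      # arbitrary test matrix
expected = [[trace(X) if r == c else 0 for c in range(2)] for r in range(2)]
via_units = sandwich_sum(units, X)               # sum_ab E_ba X E_ab
via_F = sandwich_sum(F, X)                       # sum_j F_j^dagger X F_j
print(close(via_units, expected), close(via_F, expected))
```

Both sums reproduce $\mathrm{Tr}(X)\,\mathbb{1}_2$, as expected from the unitary relation between the two bases.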


Remark 5.2.5 The uniform, trace and Hilbert-Schmidt norms are all equivalent on
finite-dimensional $\mathbb{H}$ and thus define equivalent topologies with the same converging
sequences. Indeed, given $X \in M_N(\mathbb{C})$, its norm coincides with its largest singular
value, $\|X\| = \max_{1\le i\le N} x_i$; then

$$\|X\| \le \|X\|_1, \qquad \|X\|_1 \le N\,\|X\|, \qquad \|X\| \le \|X\|_2, \qquad \|X\|_2 \le \sqrt{N}\,\|X\|.$$

Also, $\|X\|_2 \le \|X\|_1$ for the sum of squares of positive numbers is smaller than the
square of their sum; while from (5.1),

$$\|X\|_1 = \sum_{i=1}^N x_i \le \sqrt{N}\,\sqrt{\sum_{i=1}^N x_i^2} = \sqrt{N}\,\|X\|_2.$$

However, the trace and Hilbert-Schmidt norms are not C* norms; indeed, in general, for
$N \ge 2$,

$$\|X^\dagger X\|_1 = \sum_{i=1}^N x_i^2 \neq \Bigl(\sum_{i=1}^N x_i\Bigr)^2 = \|X\|_1^2, \qquad \|X^\dagger X\|_2 = \sqrt{\sum_{i=1}^N x_i^4} \neq \sum_{i=1}^N x_i^2 = \|X\|_2^2.$$

Therefore, the trace-class and Hilbert-Schmidt operators form Banach $*$-algebras
but not C* algebras.
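The norm relations of Remark 5.2.5 can be illustrated on a $2 \times 2$ matrix, for which the singular values follow in closed form from $x_1^2 + x_2^2 = \mathrm{Tr}(X^\dagger X)$ and $x_1 x_2 = |\det X|$. This is a sketch under those assumptions; the test matrix and the helper names `singular_values` and `norms` are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]

def singular_values(X):
    """Singular values (largest first) of a 2 x 2 complex matrix,
    from x1^2 + x2^2 = Tr(X^dagger X) and x1 * x2 = |det X|."""
    t = sum(abs(x) ** 2 for row in X for x in row)
    d = abs(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    disc = math.sqrt(max(t * t - 4 * d * d, 0.0))
    return math.sqrt((t + disc) / 2), math.sqrt(max((t - disc) / 2, 0.0))

def norms(X):
    x1, x2 = singular_values(X)
    return {"uniform": x1, "trace": x1 + x2, "hs": math.sqrt(x1 * x1 + x2 * x2)}

X = [[1, 2], [0, 1]]                    # arbitrary non-normal test matrix
nX = norms(X)
nXX = norms(matmul(dagger(X), X))       # norms of X^dagger X

for name in ("uniform", "trace", "hs"):
    print(name, nX[name] ** 2, nXX[name])
```

Only the uniform norm returns the same number in both columns; for the trace and Hilbert-Schmidt norms $\|X^\dagger X\|$ differs from $\|X\|^2$, which is the failure of the C* property stated in the remark.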


5.2.2 Positive and Completely Positive Maps 


Physical transformations of quantum systems are described by linear maps acting 
either on their observables or on their states. As already seen in the classical case, 
these two possibilities are dual to each other; in the first case states are not affected, 
in the second one, states change while the observables do not. The physically rele- 
vant request is that mean values do not depend on which of the two ways they are 
calculated. 
In the classical setting, states must change while preserving their ultimate characteristic
of being probability distributions; the maps which describe classical state
transformations must thus be positivity preserving. In quantum mechanics things
are more complicated and intriguing; it is indeed necessary to sharpen the notion of
positive linear transformation. The latter is defined as follows.


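Definition 5.2.3 (Positive Maps) A linear map $\Lambda : \mathcal{B}(\mathbb{H}) \to \mathcal{B}(\mathbb{H})$ is positive if and
only if $\mathcal{B}(\mathbb{H}) \ni X \ge 0 \implies \mathcal{B}(\mathbb{H}) \ni \Lambda[X] \ge 0$.


Given a positive linear map $\Lambda$, one can always lift it to act on the operator-valued
algebras $M_N(\mathcal{B}(\mathbb{H}))$ as

$$\mathrm{id}_N \otimes \Lambda : [X_{ij}] \mapsto [\Lambda[X_{ij}]] = \begin{pmatrix} \Lambda[X_{11}] & \cdots & \Lambda[X_{1N}] \\ \vdots & \ddots & \vdots \\ \Lambda[X_{N1}] & \cdots & \Lambda[X_{NN}] \end{pmatrix}. \tag{5.33}$$

One may then ask whether $\mathrm{id}_N \otimes \Lambda$ is positive, too.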


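Proposition 5.2.1 Positive maps $\Lambda$ are hermiticity-preserving, $\Lambda[X^\dagger] = \Lambda[X]^\dagger$.
Trace-preserving positive maps are contractive with respect to the trace-norm in
(5.21):

$$\|\Lambda[X]\|_1 \le \|X\|_1 \qquad \forall\, X = X^\dagger \in \mathcal{B}(\mathbb{H}).$$


Proof If $\Lambda$ is positive, using (5.26) and the positivity of the four operators $X^\pm_{1,2}$,
one gets

$$\Lambda[X]^\dagger = \Lambda[X_1^+] - \Lambda[X_1^-] - i\,\Lambda[X_2^+] + i\,\Lambda[X_2^-] = \Lambda[X_1 - iX_2] = \Lambda[X^\dagger].$$

If $X \ge 0$, and $\Lambda$ is positive and trace-preserving, then $\Lambda[X] \ge 0$ and (5.25) yields

$$\|\Lambda[X]\|_1 = \mathrm{Tr}(\Lambda[X]) = \mathrm{Tr}(X) \le \|X\|_1.$$

For a generic $X = X^\dagger$, decomposing $X = X_+ - X_-$ as in Remark 5.2.3 and using the
triangle inequality gives $\|\Lambda[X]\|_1 \le \mathrm{Tr}X_+ + \mathrm{Tr}X_- = \|X\|_1$.

Vice versa, if $\Lambda$ is trace-preserving and $\|\Lambda[X]\|_1 \le \|X\|_1$ for all $X = X^\dagger \in \mathcal{B}(\mathbb{H})$,
then, for $X \ge 0$, (5.25) gives

$$\|\Lambda[X]\|_1 \ge \mathrm{Tr}(\Lambda[X]) = \mathrm{Tr}(X) = \|X\|_1,$$

so that

$$\|\Lambda[X]\|_1 = \|X\|_1 = \mathrm{Tr}(X) = \mathrm{Tr}(\Lambda[X]),$$

which is only possible if $\Lambda[X] \ge 0$.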
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Definition 5.2.4 (Completely Positive Maps) A linear map A: 
N-positive if and only if 


idy @ A: My B) & My B) 


B(H) => 
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B(H) is 


is positive; A is completely positive (CP if and only if it is N-positive for all N. A 
linear map A : B(H) +> BH) is called a CPU map when it is CP and also unital, 


that is A[] = 1. 


Proposition 5.2.2 CPU maps are Schwarz-positive, namely they satisfy the inequality

$$\Lambda[X^\dagger X] \ge \Lambda[X^\dagger]\,\Lambda[X] \ge 0 .$$

CPU maps are contractive on $B(\mathbb{H})$ with respect to the uniform norm.


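Proof From Example 5.2.3.8 it turns out that

$$\begin{pmatrix} \mathbb{1} & X \\ X^\dagger & X^\dagger X \end{pmatrix}
= \begin{pmatrix} \mathbb{1} & 0 \\ X^\dagger & 0 \end{pmatrix}
\begin{pmatrix} \mathbb{1} & X \\ 0 & 0 \end{pmatrix} \ge 0 \qquad \forall\, X \in B(\mathbb{H}).$$

If $\Lambda$ is CPU, it follows that

$$\mathrm{id}_2 \otimes \Lambda\left[\begin{pmatrix} \mathbb{1} & X \\ X^\dagger & X^\dagger X \end{pmatrix}\right]
= \begin{pmatrix} \mathbb{1} & \Lambda[X] \\ \Lambda[X^\dagger] & \Lambda[X^\dagger X] \end{pmatrix} \ge 0 ,$$

whence, using Example 5.2.3.7,

$$\begin{pmatrix} \Lambda[X^\dagger] & -\mathbb{1} \end{pmatrix}
\begin{pmatrix} \mathbb{1} & \Lambda[X] \\ \Lambda[X^\dagger] & \Lambda[X^\dagger X] \end{pmatrix}
\begin{pmatrix} \Lambda[X] \\ -\mathbb{1} \end{pmatrix}
= \Lambda[X^\dagger X] - \Lambda[X^\dagger]\,\Lambda[X] \ge 0 . \tag{5.34}$$

Since $B(\mathbb{H}) \ni X^\dagger X \le \|X\|^2\,\mathbb{1}$, positivity, unitality and (5.34) yield

$$\Lambda[X]^\dagger \Lambda[X] \le \Lambda[X^\dagger X] \le \|X\|^2\,\mathbb{1} ,$$

whence $\|\Lambda[X]\| \le \|X\|$.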
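
The two statements of Proposition 5.2.2 can be checked numerically in finite dimension. The following is a minimal sketch, not part of the original argument: it assumes a unital CP map given in the Kraus-like form $X \mapsto \sum_j G_j^\dagger X G_j$ with $\sum_j G_j^\dagger G_j = \mathbb{1}$ (the form discussed later in this section), and the helper names `random_unital_cp_map` and `apply_map` are hypothetical.

```python
import numpy as np

def random_unital_cp_map(n, k, rng):
    """Operators G_j (n x n) with sum_j G_j^dagger G_j = 1, built from a random isometry."""
    raw = rng.normal(size=(k * n, n)) + 1j * rng.normal(size=(k * n, n))
    iso, _ = np.linalg.qr(raw)                # orthonormal columns: iso^dagger iso = 1_n
    return [iso[j * n:(j + 1) * n, :] for j in range(k)]

def apply_map(kraus, X):
    """Lambda[X] = sum_j G_j^dagger X G_j."""
    return sum(G.conj().T @ X @ G for G in kraus)

rng = np.random.default_rng(0)
n = 3
kraus = random_unital_cp_map(n, k=4, rng=rng)
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Schwarz gap: Lambda[X^dagger X] - Lambda[X^dagger] Lambda[X] should be positive
gap = apply_map(kraus, X.conj().T @ X) - apply_map(kraus, X.conj().T) @ apply_map(kraus, X)
min_gap_eig = np.linalg.eigvalsh((gap + gap.conj().T) / 2).min()

# Contractivity in the uniform (operator) norm
norm_in = np.linalg.norm(X, 2)
norm_out = np.linalg.norm(apply_map(kraus, X), 2)
```

Up to rounding errors, `min_gap_eig` should be non-negative and `norm_out` should not exceed `norm_in`.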


Examples 5.2.6 


1. Positivity corresponds to 1-positivity; complete positivity is stronger, for there
are positive maps which are not 2-positive, a renowned example being transposition
$T_2 : M_2(\mathbb{C}) \to M_2(\mathbb{C})$, $T_2[X] = X^T$, with respect to a fixed ONB. We shall
consider the N-dimensional case: $T_N$ is positive for transposition does not affect
the spectrum of a matrix, but the partial transposition $\mathrm{id}_N \otimes T_N$ is not positive,
whence $T_N$ is not N-positive and thus not CP. This can be seen by considering
the positive matrix $E = [E_{ij}]$ of Example 5.2.3.9. Then,

$$V := \mathrm{id}_N \otimes T_N[E] = N\,\mathrm{id}_N \otimes T_N[P^N_+] = \sum_{i,j=1}^N |i\rangle\langle j| \otimes |j\rangle\langle i| \tag{5.35}$$

acts as a flip operator on $\mathbb{C}^N \otimes \mathbb{C}^N$, that is $V(|\psi\rangle \otimes |\phi\rangle) = |\phi\rangle \otimes |\psi\rangle$. Since
V has eigenvalue 1 on the N(N + 1)/2 dimensional subspace of symmetric states
and −1 on the N(N − 1)/2 dimensional subspace of anti-symmetric states, it is
not positive and $T_N$ is not N-positive, hence not completely positive.

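2. Let $\mathcal{B} \subseteq \mathcal{A}$ be a C* subalgebra of a C* algebra $\mathcal{A}$, both assumed with identity;
the map $\iota_{\mathcal{B},\mathcal{A}} : \mathcal{B} \to \mathcal{A}$ denotes the natural embedding of $\mathcal{B}$ into $\mathcal{A}$; according
to Examples 5.2.3.7 and 8, embeddings are CPU maps. Indeed, for all $A_i \in \mathcal{A}$,
$B_i \in \mathcal{B}$ and $N \in \mathbb{N}$,

$$\sum_{i,j=1}^N A_i^\dagger\, \iota_{\mathcal{B},\mathcal{A}}[B_i^\dagger B_j]\, A_j = Z^\dagger Z \ge 0 , \qquad Z := \sum_{j=1}^N B_j A_j .$$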


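3. If $\mathcal{A}$ and $\mathcal{B}$ are C* algebras (with identity) and $\mathcal{B}$ is Abelian, then any positive map
$\Lambda : \mathcal{A} \to \mathcal{B}$ is CP. In order to check whether $\sum_{i,j=1}^n Y_i^\dagger \Lambda[X_i^\dagger X_j] Y_j \ge 0$ for all
$X_i \in \mathcal{A}$, $Y_i \in \mathcal{B}$ and $n \in \mathbb{N}$ we use that $Y_i$ and $\Lambda[X_i^\dagger X_j]$ can be identified
with functions $f(x) \in C(\mathcal{X})$ over a compact topological space $\mathcal{X}$. Then

$$\Big(\sum_{i,j=1}^n Y_i^\dagger \Lambda[X_i^\dagger X_j] Y_j\Big)(x) = \sum_{i,j=1}^n Y_i^*(x)\, \Lambda[X_i^\dagger X_j](x)\, Y_j(x) = \Lambda[Z_x^\dagger Z_x](x) \ge 0 ,$$

where $Z_x := \sum_i Y_i(x)\, X_i \in \mathcal{A}$ for all $x \in \mathcal{X}$.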

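4. If $\mathcal{A}$ and $\mathcal{B}$ are C* algebras with identity and $\mathcal{A}$ is Abelian, then any positive map
$\Lambda : \mathcal{A} \to \mathcal{B}$ is CP. We take $\mathcal{B} = B(\mathbb{H})$ and check whether $\sum_{i,j=1}^n \langle \psi_i | \Lambda[X_i^\dagger X_j] | \psi_j \rangle \ge 0$
for all $\psi_i \in \mathbb{H}$, $n \in \mathbb{N}$ and $X_i \in \mathcal{A}$ identified with functions in $C(\mathcal{X})$.
By duality, $\Lambda^T[\,|\psi_j\rangle\langle\psi_i|\,]$ gives a complex measure $\mu_{ij}$ on $\mathcal{X}$ such that

$$\sum_{i,j=1}^n \langle \psi_i | \Lambda[X_i^\dagger X_j] | \psi_j \rangle = \int_{\mathcal{X}} \sum_{i,j=1}^n \mathrm{d}\mu_{ij}(x)\, X_i^*(x)\, X_j(x) \ge 0 .$$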
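Indeed, for all $c_i \in \mathbb{C}$, $|\psi\rangle := \sum_i c_i |\psi_i\rangle$ and any positive $f \in C(\mathcal{X})$,

$$\sum_{i,j=1}^n c_i^* c_j \int_{\mathcal{X}} \mathrm{d}\mu_{ij}(x)\, f(x) = \sum_{i,j=1}^n c_i^* c_j\, \langle \psi_i | \Lambda[f] | \psi_j \rangle = \langle \psi | \Lambda[f] | \psi \rangle \ge 0 .$$

In order to ascertain whether a linear map is completely positive, it seems
necessary to check N-positivity for all $N \in \mathbb{N}$. Luckily, the following result [101,102]
shows that N-positive maps $\Lambda : M_N(\mathbb{C}) \to B(\mathbb{H})$ are automatically CP.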


Theorem 5.2.1 (Choi) A linear map $\Lambda : M_N(\mathbb{C}) \to B(\mathbb{H})$ is CP if and only if
$\mathrm{id}_N \otimes \Lambda[E] \ge 0$, where E is the matrix introduced in Example 5.2.3.9.


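Proof As seen in Example 5.2.3.9, $E \ge 0$; thus the "only if" part follows from the
fact that if $\Lambda$ is CP it is N-positive (see Definition 5.2.4).

As regards the "if" part, in order to check that $\Lambda$ N-positive implies $\Lambda$ CP, we
choose $M \in \mathbb{N}$ arbitrary and show that $\mathrm{id}_M \otimes \Lambda[X] \ge 0$ for all
$M_M(\mathbb{C}) \otimes M_N(\mathbb{C}) \ni X \ge 0$. Using Example 5.2.3.8, it is sufficient to show that
$\sum_{i,j=1}^M Y_i^\dagger \Lambda[X_i^\dagger X_j] Y_j \ge 0$ for all choices of $Y_i \in B(\mathbb{H})$ and $X_i \in M_N(\mathbb{C})$.
Then, by writing $X_i = \sum_{k,\ell=1}^N x^i_{k\ell}\, E_{k\ell} \in M_N(\mathbb{C})$, from the assumed positivity of
$[\Lambda[E_{ij}]]$ and Example 5.2.3.7 it turns out that

$$\sum_{i,j=1}^M Y_i^\dagger \Lambda[X_i^\dagger X_j] Y_j
= \sum_{k,\ell,s=1}^N \Big(\sum_{i=1}^M x^i_{k\ell}\, Y_i\Big)^\dagger \Lambda[E_{\ell s}]\, \underbrace{\Big(\sum_{j=1}^M x^j_{ks}\, Y_j\Big)}_{Z_{ks}}
= \sum_{k=1}^N \sum_{\ell,s=1}^N Z_{k\ell}^\dagger\, \Lambda[E_{\ell s}]\, Z_{ks} \ge 0 ,$$

whence $\Lambda$ is M-positive for all $M \in \mathbb{N}$ and thus CP.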
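
Choi's criterion reduces complete positivity to a single eigenvalue computation. The sketch below illustrates this for matrix maps; it is an illustration under the assumption that maps are passed as plain Python callables on NumPy arrays, and the names `choi_matrix` and `is_completely_positive` are hypothetical helpers, not taken from the text.

```python
import numpy as np

def choi_matrix(linear_map, n):
    """Block matrix [Lambda[E_ij]], i.e. id_N (x) Lambda applied to E = sum_ij E_ij (x) E_ij."""
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            unit = np.zeros((n, n), dtype=complex)
            unit[i, j] = 1.0                     # matrix unit E_ij
            row.append(np.asarray(linear_map(unit), dtype=complex))
        rows.append(row)
    return np.block(rows)

def is_completely_positive(linear_map, n, tol=1e-10):
    """Choi's criterion: CP iff the Choi matrix is Hermitian without negative eigenvalues."""
    choi = choi_matrix(linear_map, n)
    if not np.allclose(choi, choi.conj().T, atol=tol):
        return False
    return bool(np.linalg.eigvalsh(choi).min() >= -tol)

n = 2
rng = np.random.default_rng(1)
G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

conjugation = lambda X: G.conj().T @ X @ G       # of the form G^dagger X G, hence CP
transposition = lambda X: X.T                    # positive, but not CP

cp_flags = (is_completely_positive(conjugation, n),
            is_completely_positive(transposition, n))
```

For the transposition the Choi matrix is the flip operator of (5.35), whose eigenvalue −1 makes the test fail, while the conjugation map passes it.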


Remark 5.2.6 The matrix $X_\Lambda := \mathrm{id}_N \otimes \Lambda[E] \in M_N(M_M(\mathbb{C}))$ associated with any
linear map $\Lambda : M_N(\mathbb{C}) \to M_M(\mathbb{C})$ is known as Choi matrix. Theorem 5.2.1 can
then be rephrased as: $\Lambda : M_N(\mathbb{C}) \to M_M(\mathbb{C})$ is CP if and only if its Choi matrix is
positive.

Vice versa, let $X = \sum_{i,j;s,r} X_{is,jr}\, E^{(N)}_{ij} \otimes E^{(M)}_{sr}$ be an $NM \times NM$ matrix, where
$E^{(N)}_{ij}$ and $E^{(M)}_{sr}$ denote the matrix units in $M_N(\mathbb{C})$, respectively $M_M(\mathbb{C})$. The map
$\Lambda_X : M_N(\mathbb{C}) \to M_M(\mathbb{C})$ defined by linear extension of

$$E^{(N)}_{ij} \mapsto \Lambda_X[E^{(N)}_{ij}] := \sum_{r,s=1}^M E^{(M)}_{sr}\, X_{is,jr} , \tag{5.36}$$

is such that its Choi matrix is X. Denoting by $L(N, M)$ the linear space of linear
maps $\Lambda : M_N(\mathbb{C}) \to M_M(\mathbb{C})$, the one-to-one relation

$$L(N, M) \ni \Lambda \longleftrightarrow X_\Lambda \in M_{NM}(\mathbb{C})$$

is known as Choi-Jamiołkowski isomorphism [101,190].
If a map $\Lambda : M_N(\mathbb{C}) \to B(\mathbb{H})$ is only positive, its Choi matrix cannot be positive,
but only block positive, namely only its mean values relative to product states in
$\mathbb{C}^N \otimes \mathbb{H}$ are surely non-negative.
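
The correspondence (5.36) between matrices and maps can be verified by a round trip: build the map from an arbitrary matrix X, then recompute its Choi matrix. The following is a minimal sketch under the index convention $X_{is,jr}$ = entry in row $(i,s)$ and column $(j,r)$; the function names are hypothetical.

```python
import numpy as np

def choi_matrix(linear_map, n):
    """Block matrix [Lambda[E_ij]] for a map acting on n x n matrices."""
    units = np.eye(n * n).reshape(n, n, n, n)     # units[i, j] is the matrix unit E_ij
    return np.block([[linear_map(units[i, j]) for j in range(n)] for i in range(n)])

def map_from_choi(X, n, m):
    """Map Lambda_X of (5.36): Lambda_X[A] = sum_ij A_ij * (block (i, j) of X)."""
    blocks = np.asarray(X).reshape(n, m, n, m)    # blocks[i, s, j, r] = X_{is,jr}
    return lambda A: np.einsum('ij,isjr->sr', A, blocks)

n, m = 2, 3
rng = np.random.default_rng(2)
X = rng.normal(size=(n * m, n * m)) + 1j * rng.normal(size=(n * m, n * m))

Lambda_X = map_from_choi(X, n, m)
round_trip = choi_matrix(Lambda_X, n)             # should reproduce X
```

No positivity is assumed for X here: the isomorphism holds for all linear maps, and complete positivity only enters through the positivity of X.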


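Proposition 5.2.3 A linear map $\Lambda : M_N(\mathbb{C}) \to B(\mathbb{H})$ is positive if and only if
$\langle \psi \otimes \phi |\, \mathrm{id}_N \otimes \Lambda[E]\, | \psi \otimes \phi \rangle \ge 0$ for all $\psi \in \mathbb{C}^N$ and $\phi \in \mathbb{H}$.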


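Proof Positive matrices can be written as sums of projectors with positive
coefficients, thus, according to Definition 5.2.3, $\Lambda : M_N(\mathbb{C}) \to B(\mathbb{H})$ is positive if and
only if $\langle \phi | \Lambda[\,|\psi\rangle\langle\psi|\,] | \phi \rangle \ge 0$ for all $\phi \in \mathbb{H}$ and $\psi \in \mathbb{C}^N$. But,

$$\langle \phi | \Lambda[\,|\psi\rangle\langle\psi|\,] | \phi \rangle
= \sum_{i,j=1}^N \psi(i)\,\psi^*(j)\, \langle \phi | \Lambda[E_{ij}] | \phi \rangle
= \langle \psi^* \otimes \phi |\, \sum_{i,j=1}^N E_{ij} \otimes \Lambda[E_{ij}]\, | \psi^* \otimes \phi \rangle
= \langle \psi^* \otimes \phi |\, \mathrm{id}_N \otimes \Lambda[E]\, | \psi^* \otimes \phi \rangle ,$$

where $|\psi^*\rangle = \sum_{i=1}^N \psi^*(i)\, |i\rangle$ with respect to the fixed ONB in $\mathbb{C}^N$ such that
$E_{ij} = |i\rangle\langle j|$.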
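
Block positivity can be seen at work on the transposition, whose Choi matrix is the flip operator V of (5.35): V has a negative eigenvalue, yet its mean values on product vectors equal $|\langle\psi|\phi\rangle|^2 \ge 0$. The short sketch below checks both facts numerically; the variable names are illustrative only.

```python
import numpy as np

n = 3
# Flip operator V = sum_ij |i><j| (x) |j><i|, the Choi matrix of the transposition
V = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        V[i * n + j, j * n + i] = 1.0

min_eigenvalue = np.linalg.eigvalsh(V).min()       # -1 on anti-symmetric vectors

rng = np.random.default_rng(3)
product_values, overlaps = [], []
for _ in range(5):
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    phi = rng.normal(size=n) + 1j * rng.normal(size=n)
    vec = np.kron(psi, phi)                        # product vector psi (x) phi
    product_values.append(np.vdot(vec, V @ vec).real)
    overlaps.append(abs(np.vdot(psi, phi)) ** 2)   # |<psi|phi>|^2
```

The lists `product_values` and `overlaps` should coincide entry by entry, confirming that V is block positive although not positive.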


In finite dimension the structure of CP maps can be made explicit by means of
the Hilbert-space structure inherited by $M_N(\mathbb{C})$ from the Hilbert-Schmidt scalar
product (5.28).


Examples 5.2.7 


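1. Consider the linear space $L(N, N)$ of linear maps $\Lambda : M_N(\mathbb{C}) \to M_N(\mathbb{C})$; it can
be given a Hilbert space structure by means of the Hilbert-Schmidt scalar product
of their Choi matrices,

$$\langle\langle \Lambda_1 | \Lambda_2 \rangle\rangle := \mathrm{Tr}\big( (\mathrm{id}_N \otimes \Lambda_1[E])^\dagger\; \mathrm{id}_N \otimes \Lambda_2[E] \big) .$$

Consider an ONB $\{F_j\}_{j=1}^{N^2}$ in $M_N(\mathbb{C})$ and the maps $\Phi_{ij} \in L(N, N)$, defined by
$\Phi_{ij}[X] := F_i^\dagger X F_j$. They satisfy $\langle\langle \Phi_{ij} | \Phi_{k\ell} \rangle\rangle = \delta_{ik}\delta_{j\ell}$ and therefore form
an ONB in $L(N, N)$, whence

$$\Lambda[X] = \sum_{i,j=1}^{N^2} L_{ij}\, F_i^\dagger X F_j , \qquad L_{ij} = \langle\langle \Phi_{ij} | \Lambda \rangle\rangle = \sum_{a,b=1}^N \mathrm{Tr}\big( F_j^\dagger E_{ba} F_i\, \Lambda[E_{ab}] \big) , \tag{5.37}$$

for all $\Lambda \in L(N, N)$. If $\Lambda$ preserves hermiticity, the $N^2 \times N^2$ matrix of
coefficients $L_{ij}$ is Hermitean and can be diagonalized, $L_{ij} = \sum_k \ell_k\, V_{ki}^* V_{kj}$. Suppose
the eigenvalues $\ell_k$ are positive, then

$$\Lambda[X] = \sum_k G_k^\dagger X G_k , \qquad G_k := \sqrt{\ell_k}\, \sum_j V_{kj}\, F_j . \tag{5.38}$$

Using Example 5.2.3.8, maps $\Gamma \in L(N, N)$ of the form $\Gamma[X] = G^\dagger X G$ are
easily proved to be CP. Therefore, linear maps $\Lambda : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ of the
form (5.38) are CP, and CPU if $\sum_k G_k^\dagger G_k = \mathbb{1}_N$. Notice that the decomposition
(5.38) is highly non-unique; another possible decomposition is indeed provided
by (5.37) if the matrix $[L_{ij}]$ is positive.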

2. Let $\mathrm{Tr}_N : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ be the trace map of Example 5.2.5 and consider the
reduction map [182] $\Gamma : M_N(\mathbb{C}) \to M_N(\mathbb{C})$,

$$\Gamma[X] = \mathrm{Tr}_N[X] - X . \tag{5.39}$$

$\Gamma$ is positive, but not CP; positivity follows since, if $X \ge 0$, $\mathrm{Tr}_N[X]$ is not smaller
than any of the eigenvalues of X. On the other hand, using (5.17), the Choi matrix
of $\Gamma$ turns out to be $\mathrm{id}_N \otimes \Gamma[E] = \mathbb{1}_{N^2} - N P^N_+$; it has a negative eigenvalue
$1 - N$, whence $\Gamma$ cannot be CP.

3. The transposition $T_N : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ is the paradigm of a map which is
positive but not CP (as it is not N-positive); by combining $T_N$ with $\Gamma$ in (5.39),
$\Lambda := \Gamma \circ T_N : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ is CP. Indeed, using (5.35), $\mathrm{id}_N \otimes \Lambda[E] =
\mathrm{id}_N \otimes \Gamma[V] = \mathbb{1}_{N^2} - V \ge 0$, for the eigenvalues of the flip operator are $\pm 1$.
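
The spectra quoted in the last two examples are easy to reproduce numerically. The sketch below assumes that the trace map acts as $X \mapsto \mathrm{Tr}(X)\,\mathbb{1}_N$ and computes the Choi matrices of the reduction map and of its composition with the transposition; names such as `reduction` and `composed` are illustrative.

```python
import numpy as np

n = 3
units = np.eye(n * n).reshape(n, n, n, n)          # units[i, j] is the matrix unit E_ij

def choi_matrix(linear_map):
    """id_N (x) Lambda [E] = sum_ij E_ij (x) Lambda[E_ij]."""
    return sum(np.kron(units[i, j], linear_map(units[i, j]))
               for i in range(n) for j in range(n))

reduction = lambda X: np.trace(X) * np.eye(n) - X  # Gamma[X] = Tr(X) 1 - X
composed = lambda X: reduction(X.T)                # Gamma composed with transposition

eig_reduction = np.linalg.eigvalsh(choi_matrix(reduction))
eig_composed = np.linalg.eigvalsh(choi_matrix(composed))
```

One expects `eig_reduction` to contain the single negative value $1 - N$, while `eig_composed` consists of the eigenvalues 0 and 2 of $\mathbb{1} - V$.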


The next two results [214,339] show that the CP maps are completely characterized 
by a structure as in (5.38). 


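Theorem 5.2.2 (Stinespring Dilation) A unital map $\Lambda : B(\mathbb{H}) \to B(\mathbb{K})$ is CPU if
and only if there exists a triplet $(\mathbb{K}_\Lambda, \pi_\Lambda(B(\mathbb{H})), V_\Lambda)$, where $\mathbb{K}_\Lambda$ is a Hilbert space,
$V_\Lambda : \mathbb{K} \to \mathbb{K}_\Lambda$ an isometry and $\pi_\Lambda : B(\mathbb{H}) \to B(\mathbb{K}_\Lambda)$ a representation of $B(\mathbb{H})$ on
$\mathbb{K}_\Lambda$ such that

$$\Lambda[X] = V_\Lambda^\dagger\, \pi_\Lambda(X)\, V_\Lambda . \tag{5.40}$$

The triplet $(\mathbb{K}_\Lambda, \pi_\Lambda(B(\mathbb{H})), V_\Lambda)$ is unique up to unitary equivalences.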


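Proof If $\Lambda$ has the form (5.40) with $V_\Lambda^\dagger V_\Lambda = \mathbb{1}_{\mathbb{K}}$, Example 5.2.3.8 shows that $\Lambda$ is
CPU. To prove the converse, consider the linear span of all elements of the form
$X \otimes \psi$, $X \in B(\mathbb{H})$ and $\psi \in \mathbb{K}$,

$$\mathfrak{B} := \Big\{ \sum_i X_i \otimes \psi_i \;:\; X_i \in B(\mathbb{H}),\ \psi_i \in \mathbb{K} \Big\} ,$$

and the sesquilinear form $\langle \cdot | \cdot \rangle_\Lambda : \mathfrak{B} \times \mathfrak{B} \to \mathbb{C}$ defined by

$$(X \otimes \psi,\, Y \otimes \phi) \mapsto \langle X \otimes \psi | Y \otimes \phi \rangle_\Lambda := \langle \psi | \Lambda[X^\dagger Y] | \phi \rangle . \tag{5.41}$$

If $\Lambda$ is CP, this sesquilinear form is positive on $\mathfrak{B} \times \mathfrak{B}$ (see Example 5.2.3.7),

$$\Big\langle \sum_{i=1}^N X_i \otimes \psi_i \Big| \sum_{j=1}^N X_j \otimes \psi_j \Big\rangle_\Lambda
= \sum_{i,j=1}^N \langle \psi_i | \Lambda[X_i^\dagger X_j] | \psi_j \rangle
= \langle \Psi |\, \mathrm{id}_N \otimes \Lambda\Big[\sum_{i,j=1}^N E_{ij} \otimes X_i^\dagger X_j\Big]\, | \Psi \rangle \ge 0 \qquad \forall\, N \in \mathbb{N} ,$$

where $|\Psi\rangle := \sum_{i=1}^N |i\rangle \otimes |\psi_i\rangle$.

By considering the quotient of $\mathfrak{B}$ by the kernel of the sesquilinear form, (5.41)
yields a scalar product on the linear span $\mathfrak{B}_\Lambda := \mathfrak{B}/\mathrm{Kern}(\langle \cdot | \cdot \rangle_\Lambda)$ of the equivalence
classes $[X \otimes \psi]$ (see the discussion after Definition 5.3.5). Set $\mathbb{K}_\Lambda$ equal to the
closure of $\mathfrak{B}_\Lambda$ with respect to the scalar product $\langle \cdot | \cdot \rangle_\Lambda$ and let $\pi_\Lambda$ represent $B(\mathbb{H})$
on $\mathbb{K}_\Lambda$ by $\pi_\Lambda(X)[Y \otimes \phi] = [XY \otimes \phi]$. Then, the linear maps $V_\Lambda : \mathbb{K} \to \mathbb{K}_\Lambda$ and
$V_\Lambda^\dagger : \mathbb{K}_\Lambda \to \mathbb{K}$,

$$V_\Lambda |\phi\rangle = [\mathbb{1} \otimes \phi] , \qquad V_\Lambda^\dagger [X \otimes \phi] = \Lambda[X]\, |\phi\rangle ,$$

define an isometry ($\Lambda$ is CPU) and $V_\Lambda^\dagger \pi_\Lambda(X) V_\Lambda |\phi\rangle = V_\Lambda^\dagger [X \otimes \phi] = \Lambda[X]\, |\phi\rangle$ for
all $\phi \in \mathbb{K}$.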


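Example 5.2.8 That CPU maps are contractions (see Proposition 5.2.2) comes
easily from Stinespring dilation. In fact, $V_\Lambda$ is an isometry, thus $V_\Lambda V_\Lambda^\dagger \le \mathbb{1}$, whence

$$\Lambda[X^\dagger X] = V_\Lambda^\dagger\, \pi_\Lambda(X)^\dagger \pi_\Lambda(X)\, V_\Lambda \ge V_\Lambda^\dagger\, \pi_\Lambda(X)^\dagger\, V_\Lambda V_\Lambda^\dagger\, \pi_\Lambda(X)\, V_\Lambda = \Lambda[X]^\dagger \Lambda[X] .$$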


Proposition 5.2.4 (Kraus Representation) $\Lambda : B(\mathbb{H}) \to B(\mathbb{K})$ is CPU if and only
if it admits a Kraus representation of the form

$$\Lambda[X] = \sum_j G_j^\dagger X G_j , \tag{5.42}$$

where the Kraus operators $G_j : \mathbb{K} \to \mathbb{H}$ are such that, if infinite, the sum converges
in the strong-operator topology.


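Proof The Stinespring representation is of the form $B(\mathbb{H}) \otimes \mathbb{1}_{\mathcal{K}}$ on $\mathbb{H} \otimes \mathcal{K}$, for a
finite dimensional or countably infinite auxiliary Hilbert space $\mathcal{K}$. Given an ONB
$\{|j\rangle\}$ in $\mathcal{K}$, the isometry $V_\Lambda : \mathbb{K} \to \mathbb{H} \otimes \mathcal{K}$ and its adjoint $V_\Lambda^\dagger : \mathbb{H} \otimes \mathcal{K} \to \mathbb{K}$ read

$$V_\Lambda |\phi\rangle = \sum_j G_j |\phi\rangle \otimes |j\rangle , \qquad V_\Lambda^\dagger\, |\psi \otimes \phi\rangle = \sum_j \langle j | \phi \rangle\, G_j^\dagger |\psi\rangle ,$$

with $G_j : \mathbb{K} \to \mathbb{H}$ and $\sum_j G_j^\dagger G_j = \mathbb{1}_{\mathbb{K}}$.

Remarks 5.2.7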


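1. Using (5.38), the composition of CPU maps results in a CPU map. Indeed, let
$\Lambda_{12} : B(\mathbb{H}_1) \to B(\mathbb{H}_2)$, $\Lambda_{12}[X] = \sum_j G_{12}^\dagger(j)\, X\, G_{12}(j)$, $X \in B(\mathbb{H}_1)$, and
$\Lambda_{23} : B(\mathbb{H}_2) \to B(\mathbb{H}_3)$, $\Lambda_{23}[Y] = \sum_k G_{23}^\dagger(k)\, Y\, G_{23}(k)$, $Y \in B(\mathbb{H}_2)$; then,

$$\Lambda_{23} \circ \Lambda_{12}[X] = \sum_{j,k} G_{13}^\dagger(jk)\, X\, G_{13}(jk) ,$$

with new Kraus operators $G_{13}(jk) := G_{12}(j)\, G_{23}(k)$.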


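2. Let $\Lambda : B(\mathbb{H}) \to B(\mathbb{H})$ be CPU and T the transposition with respect to a fixed
ONB in $\mathbb{H}$; while $\Lambda \circ T$ need not be CPU, $T \circ \Lambda \circ T$ surely is; indeed,
$T \circ \Lambda \circ T[X] = \sum_j \big( G_j^\dagger X^T G_j \big)^T = \sum_j G_j^T X\, G_j^*$, with $T[X] =: X^T$, $X \in B(\mathbb{H})$, the
transpose of X and $X^* := (X^T)^\dagger$ its complex conjugate.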


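3. In full generality, a CPU map $\Lambda : B(\mathbb{H}) \to B(\mathbb{K})$ has the form

$$\Lambda[X] = \sum_{i,j} C_{ij}\, L_i^\dagger X L_j ,$$

with $\sum_{i,j} C_{ij}\, L_i^\dagger L_j = \mathbb{1}_{\mathbb{K}}$ and $C = [C_{ij}]$ a positive matrix, from which the
diagonal Kraus representation is achieved by diagonalization as in Example 5.2.7.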


4. While the structure of completely positive maps is fully under control, it is not so
for positive maps which are still somewhat elusive. For instance, if the coefficients
matrix $C = [C_{ij}]$ is Hermitean but not positive, by grouping together its positive
and negative eigenvalues, $c_k$, $\Lambda$ can always be written as the difference of two CP
maps,

$$\Lambda[X] = \sum_{c_k \ge 0} c_k\, G_k^\dagger X G_k - \sum_{c_k < 0} |c_k|\, G_k^\dagger X G_k . \tag{5.43}$$

For instance, let $\{F_j\}_{j=1}^{N^2}$ be a Hilbert-Schmidt ONB in $M_N(\mathbb{C})$ with $F_1 = \mathbb{1}_N/\sqrt{N}$;
using (5.32), the reduction map in Example 5.2.7.2, which is positive, but not CP,
reads

$$\Gamma[X] = \Big(\frac{1}{N} - 1\Big) X + \sum_{j=2}^{N^2} F_j^\dagger X F_j .$$

If there are no negative $c_k$, then $\Lambda$ is completely positive; if not, no general rule
exists to deduce from the $c_k$ whether $\Lambda$ is a positive map. For particular sufficient
conditions see Propositions 6.2.2 and 6.2.3 in Chap. 6.
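
The relation between the Kraus form (5.42), the Stinespring dilation (5.40) and the composition rule of Remark 5.2.7.1 can be illustrated in finite dimension. The sketch below is an illustration under stated assumptions: the Kraus operators are random, the auxiliary space is taken to be $\mathbb{C}^k$ with k the number of Kraus operators, and all function names are hypothetical.

```python
import numpy as np

def random_kraus(dim_h, dim_k, count, rng):
    """Random G_j : K -> H (dim_h x dim_k arrays) with sum_j G_j^dagger G_j = 1_K."""
    shape = (count * dim_h, dim_k)
    raw = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    iso, _ = np.linalg.qr(raw)                   # orthonormal columns: iso^dagger iso = 1_K
    return [iso[j * dim_h:(j + 1) * dim_h, :] for j in range(count)]

def apply_kraus(kraus, X):
    """Lambda[X] = sum_j G_j^dagger X G_j."""
    return sum(G.conj().T @ X @ G for G in kraus)

def stinespring_isometry(kraus):
    """V = sum_j G_j (x) |j> : K -> H (x) C^count."""
    count = len(kraus)
    return sum(np.kron(G, np.eye(count)[:, [j]]) for j, G in enumerate(kraus))

rng = np.random.default_rng(4)
d1, d2, d3 = 2, 3, 2
kraus_12 = random_kraus(d1, d2, 3, rng)          # Lambda_12 : B(H_1) -> B(H_2)
kraus_23 = random_kraus(d2, d3, 2, rng)          # Lambda_23 : B(H_2) -> B(H_3)
X = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))

# Stinespring form: Lambda_12[X] = V^dagger (X (x) 1) V
V = stinespring_isometry(kraus_12)
dilated = V.conj().T @ np.kron(X, np.eye(len(kraus_12))) @ V

# Composition: Kraus operators G_13(jk) = G_12(j) G_23(k)
kraus_13 = [G @ F for G in kraus_12 for F in kraus_23]
composed_direct = apply_kraus(kraus_23, apply_kraus(kraus_12, X))
composed_kraus = apply_kraus(kraus_13, X)
```

Here `dilated` should agree with `apply_kraus(kraus_12, X)`, and the two ways of computing the composition should coincide.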


5.2.2.1 Conditional Expectations 
Particularly important CPU maps are the so-called conditional expectations which
are the non-commutative counterparts of the Radon-Nikodym derivative in (2.51).

Definition 5.2.5 ([140]) A positive, unital linear map $\mathbb{E} : \mathcal{A} \to \mathcal{B} \subseteq \mathcal{A}$, where $\mathcal{A}$
and $\mathcal{B}$ are C* algebras with identity, is a conditional expectation of $\mathcal{A}$ onto $\mathcal{B}$ if
$\mathbb{E}[AB] = \mathbb{E}[A]\,B$ for all $A \in \mathcal{A}$ and $B \in \mathcal{B}$.


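Proposition 5.2.5 Conditional expectations enjoy the following properties:

$$\mathbb{E}[A]^\dagger = \mathbb{E}[A^\dagger] \qquad \forall\, A \in \mathcal{A} \tag{5.44}$$
$$\mathbb{E}[BA] = B\,\mathbb{E}[A] \qquad \forall\, A \in \mathcal{A},\ B \in \mathcal{B} \tag{5.45}$$
$$\mathbb{E} \circ \mathbb{E} = \mathbb{E} \tag{5.46}$$
$$\mathbb{E}[A^\dagger A] \ge \mathbb{E}[A]^\dagger\, \mathbb{E}[A] \qquad \forall\, A \in \mathcal{A} \tag{5.47}$$
$$\|\mathbb{E}\| = 1 . \tag{5.48}$$

Further, $\mathbb{E}$ is a CPU map.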


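Proof Property (5.44) comes from positivity as in Example 5.2.6.1; Property (5.45)
is a consequence of (5.44):

$$\mathbb{E}[BA] = \mathbb{E}[(BA)^\dagger]^\dagger = \mathbb{E}[A^\dagger B^\dagger]^\dagger = \big(\mathbb{E}[A^\dagger]\, B^\dagger\big)^\dagger = B\,\mathbb{E}[A] .$$

Property (5.46) follows from $\mathbb{E}[\mathbb{E}[A]] = \mathbb{E}[\mathbb{1}]\,\mathbb{E}[A] = \mathbb{E}[A]$ for all $A \in \mathcal{A}$; in order to prove
Property (5.47) consider $A - \mathbb{E}[A]$ and use positivity and (5.45), then

$$0 \le \mathbb{E}\big[(A - \mathbb{E}[A])^\dagger (A - \mathbb{E}[A])\big] = \mathbb{E}[A^\dagger A] - \mathbb{E}[A]^\dagger\, \mathbb{E}[A] .$$

Property (5.48) results from the previous property and positivity:

$$A^\dagger A \le \|A\|^2\,\mathbb{1} \;\Longrightarrow\; \mathbb{E}[A]^\dagger\, \mathbb{E}[A] \le \mathbb{E}[A^\dagger A] \le \|A\|^2\,\mathbb{1} .$$

Complete positivity is a consequence of (5.45) which yields

$$\mathbb{E}[B_1 A B_2] = B_1\, \mathbb{E}[A]\, B_2 \qquad \forall\, B_{1,2} \in \mathcal{B},\ A \in \mathcal{A} .$$

Therefore, from Examples 5.2.3.7 and 8,

$$\sum_{i,j=1}^N B_i^\dagger\, \mathbb{E}[A_i^\dagger A_j]\, B_j = \sum_{i,j=1}^N \mathbb{E}[B_i^\dagger A_i^\dagger A_j B_j] = \mathbb{E}[Z^\dagger Z] \ge 0 ,$$

for all $B_i \in \mathcal{B}$, $A_j \in \mathcal{A}$ and $N \in \mathbb{N}$, where $Z := \sum_{j=1}^N A_j B_j$.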


Because of the properties 3 and 5, conditional expectations are also called
projections of norm one.

Remark 5.2.8 In case $\mathcal{A}$ is a von Neumann algebra and $\mathcal{A}_0 \subseteq \mathcal{A}$ a von Neumann
subalgebra, one calls conditional expectations all projections of norm one which are
also normal. This latter property of linear maps $\Lambda : \mathcal{A}_1 \to \mathcal{A}_2$ between von Neumann
algebras amounts to the following [80]. Let $\{A_\mu\}_\mu$ be an increasing net of operators
in $\mathcal{A}_1$, that is a set of operators indexed by a set of indexes $\mu \in M$ equipped with a
partial ordering $\le$ such that $\mu_1 \le \mu_2 \Longrightarrow A_{\mu_1} \le A_{\mu_2}$. If the net $\{A_\mu\}_\mu$ has an upper
bound, then it has a least upper bound $A \in \mathcal{A}_1$ to which the net converges strongly:
$s\text{-}\lim_\mu A_\mu = A$. Then, $\Lambda$ is normal if for all nets $\{A_\mu\}_\mu$ with an upper bound
$\lim_\mu \Lambda[A_\mu] = \Lambda[\lim_\mu A_\mu]$.


Examples 5.2.9 


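1. Let $\{P_i\}_{i \in I} \subset B(\mathbb{H})$ be orthogonal projections, $P_i P_j = \delta_{ij} P_i$, such that
$\sum_{i \in I} P_i = \mathbb{1}$; then $\mathbb{E}[X] := \sum_{i \in I} P_i X P_i$ is a conditional expectation from $B(\mathbb{H})$ onto the
Abelian subalgebra $\mathcal{P}$ generated by the $P_i$. Indeed, it is positive, linear and writing
$\mathcal{P} \ni P = \sum_{j \in I} p_j P_j$, it turns out that

$$\mathbb{E}[XP] = \sum_{i,j \in I} p_j\, P_i X P_j P_i = \sum_{i \in I} p_i\, P_i X P_i = \mathbb{E}[X]\, P .$$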


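2. Consider two finite-level systems A and B described by matrix algebras $M_{n_a}(\mathbb{C})$,
respectively $M_{n_b}(\mathbb{C})$; let $\widehat{\mathrm{Tr}}_A$ denote the normalized trace map performed with
respect to party A, namely

$$\widehat{\mathrm{Tr}}_A(X_A) = \frac{1}{n_a}\, \mathrm{Tr}_A(X_A)\, \mathbb{1}_A . \tag{5.49}$$

A linear map from the matrix algebra $M_{n_a}(\mathbb{C}) \otimes M_{n_b}(\mathbb{C})$ of the compound system
A + B onto the subalgebra $\mathbb{1}_A \otimes M_{n_b}(\mathbb{C})$ is obtained by defining

$$\mathbb{E}[X_A \otimes X_B] := \widehat{\mathrm{Tr}}_A(X_A) \otimes X_B \qquad \forall\, X_A \in M_{n_a}(\mathbb{C}),\ X_B \in M_{n_b}(\mathbb{C})$$

on tensor products and then by extending it by linearity and continuity to the
whole of $M_{n_a}(\mathbb{C}) \otimes M_{n_b}(\mathbb{C})$. Any $0 \le X \in M_{n_a}(\mathbb{C}) \otimes M_{n_b}(\mathbb{C})$ can be written as
$\sum_{i,j=1}^{n_b} (X_A^i)^\dagger X_A^j \otimes E^B_{ij}$ by means of a system of matrix units $\{E^B_{ij}\}_{i,j=1}^{n_b}$ in $M_{n_b}(\mathbb{C})$
(see Examples 5.2.3.7 and 8). Thus, one verifies that $\mathbb{E}$ is a positive linear map;
indeed,

$$\langle \psi_B | \mathbb{E}[X] | \psi_B \rangle = \sum_{i,j=1}^{n_b} \widehat{\mathrm{Tr}}_A\big( (X_A^i)^\dagger X_A^j \big)\, \langle \psi_B | E^B_{ij} | \psi_B \rangle
= \frac{1}{n_a}\, \mathrm{Tr}_A\big( (Y_A)^\dagger Y_A \big)\, \mathbb{1}_A \ge 0 ,$$

where $Y_A := \sum_{j=1}^{n_b} \psi_B^j\, X_A^j$ with $\psi_B^j$ the j-th component of $|\psi_B\rangle \in \mathbb{C}^{n_b}$ along
the ONB associated with the chosen matrix units. By writing the identity matrix
$\mathbb{1}_{A+B} = \sum_{i=1}^{n_a} \sum_{j=1}^{n_b} E^A_{ii} \otimes E^B_{jj}$, where $\{E^A_{ij}\}_{i,j=1}^{n_a}$ is a system of matrix units in
$M_{n_a}(\mathbb{C})$, one shows that $\mathbb{E}[\mathbb{1}_{A+B}] = \mathbb{1}_{A+B}$, whence $\mathbb{E}$ is unital. Furthermore,
as any $X \in M_{n_a}(\mathbb{C}) \otimes M_{n_b}(\mathbb{C})$ can be written in the form $X = \sum_\ell X_A^\ell \otimes X_B^\ell$, it
turns out that

$$\mathbb{E}[X\, X_B] = \sum_\ell \widehat{\mathrm{Tr}}_A(X_A^\ell) \otimes X_B^\ell X_B = \mathbb{E}[X]\, X_B ,$$

whence $\mathbb{E}$ is a conditional expectation from $M_{n_a}(\mathbb{C}) \otimes M_{n_b}(\mathbb{C})$ onto
$\mathbb{1}_A \otimes M_{n_b}(\mathbb{C})$.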
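
Both examples of conditional expectations can be tested numerically. The sketch below implements the map $X \mapsto \sum_i P_i X P_i$ and the normalized partial trace, then checks the module property $\mathbb{E}[XB] = \mathbb{E}[X]B$ and idempotency (5.46) on random matrices; the helper names are hypothetical and the dimensions are arbitrary choices.

```python
import numpy as np

def pinching(projections, X):
    """E[X] = sum_i P_i X P_i for orthogonal projections summing to the identity."""
    return sum(P @ X @ P for P in projections)

def normalized_partial_trace(X, n_a, n_b):
    """E[X] = (1_A / n_a) (x) Tr_A[X] on M_{n_a}(C) (x) M_{n_b}(C)."""
    reduced = np.einsum('abac->bc', X.reshape(n_a, n_b, n_a, n_b))   # Tr_A[X]
    return np.kron(np.eye(n_a) / n_a, reduced)

rng = np.random.default_rng(5)

# Example 1: two projections on C^3 and an element P = 2 P_1 - 3 P_2 of their algebra
P1, P2 = np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])
X3 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
P = 2.0 * P1 - 3.0 * P2
pinch_module = np.allclose(pinching([P1, P2], X3 @ P), pinching([P1, P2], X3) @ P)

# Example 2: normalized partial trace with n_a = 2, n_b = 3
n_a, n_b = 2, 3
X = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
B = np.kron(np.eye(n_a), rng.normal(size=(n_b, n_b)))       # element 1_A (x) X_B
EX = normalized_partial_trace(X, n_a, n_b)
trace_module = np.allclose(normalized_partial_trace(X @ B, n_a, n_b), EX @ B)
trace_idempotent = np.allclose(normalized_partial_trace(EX, n_a, n_b), EX)
```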


5.3 Von Neumann Algebras


In this section, we consider in detail some techniques proper to von Neumann algebras 
which are C* subalgebras of B(H) that are also closed with respect to the strong and 
weak topologies [80,345,353]. 


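Definition 5.3.1 The commutant of a C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$ is the C* algebra
$B(\mathbb{H}) \supseteq \mathcal{A}' := \{ X' \in B(\mathbb{H}) : [X', X] = 0\ \ \forall\, X \in \mathcal{A} \}$, the bicommutant the C*
algebra $B(\mathbb{H}) \supseteq \mathcal{A}'' := \{ X'' \in B(\mathbb{H}) : [X'', X'] = 0\ \ \forall\, X' \in \mathcal{A}' \}$.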


Remark 5.3.1 Beside being C* algebras, commutants and bicommutants are also 
closed with respect to both the strong and weak topology. Indeed, if
$X' = (s, w)\text{-}\lim_{n \to \infty} X'_n$, with $[X'_n, X] = 0$ for all $X \in \mathcal{A}$, then, because of the
continuity of the scalar product,

$$\langle \psi | [X', X] | \phi \rangle = \lim_{n \to \infty} \langle \psi | [X'_n, X] | \phi \rangle = 0$$

for all $\psi, \phi \in \mathbb{H}$, whence $X' \in \mathcal{A}'$. Therefore, $\mathcal{A}'$ and $\mathcal{A}''$ contain all those operators
that can be constructed from operators in $\mathcal{A}$ via strong-limits and weak-limits. In
particular, $\mathcal{A}'$ and $\mathcal{A}''$ contain the spectral projectors of any of their self-adjoint and
unitary elements.


Examples 5.3.1 


1. If A C A’, all its elements commute with each other and A is an Abelian C* 
algebra, maximally Abelian if A = A’. 
2. The center of $\mathcal{A}$ is the Abelian C* algebra $\mathcal{Z} := \mathcal{A} \cap \mathcal{A}'$.
3. Of the commutative algebras of Sect. 2.2.1, the C* algebra of continuous
functions, $C(\mathcal{X})$, is not maximally Abelian since it is properly contained within the C*
algebra of essentially bounded functions, $L^\infty_\mu(\mathcal{X})$; the latter is instead maximally
Abelian [345].


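4. If $\mathcal{A} = B(\mathbb{H})$, only multiples of the identity operator can commute with all
bounded operators on $\mathbb{H}$, that is $B(\mathbb{H})' = \{\lambda \mathbb{1}\}$, the trivial algebra. On the contrary,
$\{\lambda \mathbb{1}\}' = B(\mathbb{H})'' = B(\mathbb{H})$. The same is true for the C* algebra of compact operators
$\mathcal{A} = B_\infty(\mathbb{H})$: $\mathcal{A}' = \{\lambda \mathbb{1}\}$, for multiples of the identity are the only operators on $\mathbb{H}$
which commute with all finite-rank ones; however, unlike for $B(\mathbb{H})$,
$B_\infty(\mathbb{H}) \subset B(\mathbb{H}) = B_\infty(\mathbb{H})''$ in infinite dimension.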


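5. Consider the operator-valued matrix algebra $M_N(\mathcal{A})$ consisting of $N \times N$
matrices with entries from a C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$. Let $\mathbb{1}_N \otimes \mathcal{A} \subseteq M_N(\mathcal{A})$ be the
subalgebra whose elements have the form $\mathbb{1}_N \otimes X$, $X \in \mathcal{A}$. The request that
$[\mathbb{1}_N \otimes X, \sum_{i,j=1}^N E_{ij} \otimes X_{ij}] = \sum_{i,j=1}^N E_{ij} \otimes [X, X_{ij}] = 0$ for all $X \in \mathcal{A}$, with
$X_{ij} \in B(\mathbb{H})$, implies $(\mathbb{1}_N \otimes \mathcal{A})' = M_N(\mathcal{A}')$. The bicommutant $(\mathbb{1}_N \otimes \mathcal{A})''$ can
be identified by imposing that, for all $X' \in \mathcal{A}'$ and $1 \le k \le N$,

$$[E_{kk} \otimes X', Y] = \sum_{j=1}^N \big( E_{kj} \otimes X' X_{kj} - E_{jk} \otimes X_{jk} X' \big) = 0 ,$$

where $Y = \sum_{i,j=1}^N E_{ij} \otimes X_{ij} \in M_N(B(\mathbb{H}))$. This forces Y to be of the form
$Y = \sum_{k=1}^N E_{kk} \otimes X_{kk}$ with $X_{kk} \in \mathcal{A}''$. Finally, $[E_{ij} \otimes \mathbb{1}, Y] = E_{ij} \otimes (X_{jj} - X_{ii}) = 0$
for all $i, j = 1, 2, \ldots, N$, yields $X_{ii} = X$ for all i, whence
$(\mathbb{1}_N \otimes \mathcal{A})'' = \mathbb{1}_N \otimes \mathcal{A}''$.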


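6. Given $\psi \in \mathbb{H}$, let $\mathbb{H}^{\mathcal{A}}_\psi \subseteq \mathbb{H}$ be the closure of the linear span of vectors of the form
$X |\psi\rangle$, $X \in \mathcal{A}$, with $\mathcal{A} \subseteq B(\mathbb{H})$ a C* algebra, and denote by $P^{\mathcal{A}}_\psi : \mathbb{H} \to \mathbb{H}^{\mathcal{A}}_\psi$ the
corresponding orthogonal projector. Since $\mathcal{A}\, \mathbb{H}^{\mathcal{A}}_\psi \subseteq \mathbb{H}^{\mathcal{A}}_\psi$, it follows that
$P^{\mathcal{A}}_\psi X P^{\mathcal{A}}_\psi = X P^{\mathcal{A}}_\psi$ for all $X \in \mathcal{A}$, whence, by taking the adjoint,

$$X P^{\mathcal{A}}_\psi = P^{\mathcal{A}}_\psi X P^{\mathcal{A}}_\psi = \big( P^{\mathcal{A}}_\psi X^\dagger P^{\mathcal{A}}_\psi \big)^\dagger = \big( X^\dagger P^{\mathcal{A}}_\psi \big)^\dagger = P^{\mathcal{A}}_\psi X$$

for all $X \in \mathcal{A}$. Therefore, $P^{\mathcal{A}}_\psi \in \mathcal{A}'$ and $P^{\mathcal{A}'}_\psi \in \mathcal{A}''$ for all $\psi \in \mathbb{H}$.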
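
In finite dimension commutants can be computed as null spaces, which gives a quick check of relations such as $(\mathbb{1}_N \otimes \mathcal{A})' = M_N(\mathcal{A}')$. The following sketch is an illustration, not a method taken from the text: it assumes the algebra is specified by finitely many generators closed under the adjoint, and `commutant_basis` is a hypothetical helper based on a singular value decomposition.

```python
import numpy as np

def commutant_basis(operators, tol=1e-10):
    """Basis of {Y : [G, Y] = 0 for every G in `operators`}, via a null-space computation."""
    d = operators[0].shape[0]
    eye = np.eye(d)
    # Row-major vectorisation: vec(G Y - Y G) = (G (x) 1 - 1 (x) G^T) vec(Y)
    stacked = np.vstack([np.kron(G, eye) - np.kron(eye, G.T) for G in operators])
    _, sing, vh = np.linalg.svd(stacked)
    null_rows = vh[np.sum(sing > tol):]           # rows of vh spanning the null space
    return [row.conj().reshape(d, d) for row in null_rows]

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
one = np.eye(2)

# A = 1_2 (x) M_2(C), generated by 1 (x) sx and 1 (x) sz
generators = [np.kron(one, sx), np.kron(one, sz)]
commutant = commutant_basis(generators)           # expected: M_2(C) (x) 1, dimension 4
bicommutant = commutant_basis(commutant)          # expected: 1 (x) M_2(C), dimension 4

dim_full = len(commutant_basis([sx, sz]))         # M_2(C) has trivial commutant
dim_diag = len(commutant_basis([sz]))             # diagonal algebra is its own commutant
```

The dimensions `len(commutant)` and `len(bicommutant)` should both equal 4, while `dim_full` and `dim_diag` should be 1 and 2 respectively.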


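Definition 5.3.2 A vector $\psi \in \mathbb{H}$ is cyclic for a C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$ if $\mathbb{H}^{\mathcal{A}}_\psi = \mathbb{H}$,
separating if $X |\psi\rangle = 0 \Longrightarrow X = 0$ for all $X \in \mathcal{A}$.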


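Being cyclic and separating are related properties; indeed, suppose $\psi \in \mathbb{H}$ to be
cyclic for a C* algebra with identity $\mathcal{A} \subseteq B(\mathbb{H})$ and $X' |\psi\rangle = 0$ for $X' \in \mathcal{A}'$. Then,
$0 = A X' |\psi\rangle = X' A |\psi\rangle$ for all $A \in \mathcal{A}$, whence $X' = 0$ for cyclicity of $\psi \in \mathbb{H}$ amounts
to $\mathcal{A} |\psi\rangle$ being dense in $\mathbb{H}$. Vice versa, suppose $\psi$ to be separating for $\mathcal{A}'$, but not cyclic for
$\mathcal{A}$; then, $\mathcal{A}' \ni \mathbb{1} - P^{\mathcal{A}}_\psi \ne 0$, but $P^{\mathcal{A}}_\psi |\psi\rangle = |\psi\rangle$ since we assumed $\mathbb{1} \in \mathcal{A}$, which is
a contradiction.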




Lemma 5.3.1 Let $\mathcal{A} \subseteq B(\mathbb{H})$ be a C* algebra with identity; then $\psi \in \mathbb{H}$ is cyclic
for $\mathcal{A}$ if and only if it is separating for $\mathcal{A}'$.
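
Lemma 5.3.1 can be illustrated on two qubits with $\mathcal{A} = M_2(\mathbb{C}) \otimes \mathbb{1}$ and $\mathcal{A}' = \mathbb{1} \otimes M_2(\mathbb{C})$. In the sketch below, which is only an illustrative example with hypothetical helper names, cyclicity is tested through the rank of the vectors $X|\psi\rangle$ and the separating property through the linear independence of the vectors $X'|\psi\rangle$.

```python
import numpy as np

def orbit_rank(operators, psi):
    """Dimension of the linear span of the vectors X|psi>, X in `operators`."""
    return int(np.linalg.matrix_rank(np.stack([X @ psi for X in operators], axis=1)))

units = [np.outer(np.eye(2)[i], np.eye(2)[j]) for i in range(2) for j in range(2)]
algebra = [np.kron(E, np.eye(2)) for E in units]      # basis of A  = M_2(C) (x) 1
commutant = [np.kron(np.eye(2), E) for E in units]    # basis of A' = 1 (x) M_2(C)

def is_cyclic(psi):
    """psi is cyclic for A when the vectors X|psi> span the whole space."""
    return orbit_rank(algebra, psi) == psi.size

def is_separating_for_commutant(psi):
    """psi separates A' when X' -> X'|psi> is injective on a basis of A'."""
    return orbit_rank(commutant, psi) == len(commutant)

entangled = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
product = np.array([1.0, 0.0, 0.0, 0.0])                  # |00>

results = {name: (is_cyclic(vec), is_separating_for_commutant(vec))
           for name, vec in [("entangled", entangled), ("product", product)]}
```

In agreement with the lemma, the two properties hold together for the entangled vector and fail together for the product vector.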


Cyclicity refers to the possibility that, by acting on some vectors with all the
operators of a given algebra, one gets a dense subspace whose closure is the whole
Hilbert space. This is the case with the vector $|\mathbb{1}\rangle \in L^2_\mu(\mathcal{X})$ in the Koopman-von
Neumann formulation of classical mechanics (see Example 2.1.1). By acting on $|\mathbb{1}\rangle$
with the simple functions, one gets a dense linear span, whose closure is the whole
of $L^2_\mu(\mathcal{X})$. The same is true using continuous functions $f \in C(\mathcal{X})$ or essentially
bounded functions $g \in L^\infty_\mu(\mathcal{X})$.


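Differently, if a vector $\psi \in \mathbb{H}$ is not cyclic for $\mathcal{A} \subseteq B(\mathbb{H})$, then $P^{\mathcal{A}}_\psi$ projects onto
a proper $\mathcal{A}$-invariant subspace $\mathbb{H}^{\mathcal{A}}_\psi \subset \mathbb{H}$. The absence of proper invariant subspaces
with respect to $\mathcal{A}$ is related to the triviality of the commutant $\mathcal{A}'$.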


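Definition 5.3.3 (Irreducible Algebras) A C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$ is irreducible if
and only if all $0 \ne \psi \in \mathbb{H}$ are cyclic for it.


Lemma 5.3.2 A C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$ is irreducible if and only if $\mathcal{A}' = \{\lambda \mathbb{1}\}$.


Proof If $\mathcal{A}' = \{\lambda \mathbb{1}\}$ then $P^{\mathcal{A}}_\psi = \mathbb{1}$ for all $0 \ne \psi \in \mathbb{H}$. If all $0 \ne \psi \in \mathbb{H}$ are cyclic for $\mathcal{A}$ and
$\mathcal{A}' \ne \{\lambda \mathbb{1}\}$, then, according to Remark 5.3.1, there exists a projection $\mathcal{A}' \ni Q \ne \mathbb{1}$.
Therefore, if $Q |\psi\rangle = |\psi\rangle$, then $Q \mathcal{A} |\psi\rangle = \mathcal{A} |\psi\rangle$; thus, the closure of $\mathcal{A} |\psi\rangle$ cannot
equal $\mathbb{H}$.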


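Given the commutant and bicommutant of a C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$, we can
continue and consider the commutant of the bicommutant and so on. Notice that, if
$\mathcal{A} \subseteq \mathcal{B}$ are two C* algebras acting on $\mathbb{H}$, then $\mathcal{B}' \subseteq \mathcal{A}'$. Thus, from $\mathcal{A} \subseteq \mathcal{A}''$ it
follows that $\mathcal{A}''' \subseteq \mathcal{A}'$; however, $\mathcal{A}' \subseteq (\mathcal{A}')'' = \mathcal{A}'''$, whence

$$\mathcal{A} \subseteq \mathcal{A}'' = \mathcal{A}^{iv} = \cdots , \qquad \mathcal{A}' = \mathcal{A}''' = \mathcal{A}^{v} = \cdots . \tag{5.50}$$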


The closure properties of commutants and bicommutants discussed in 
Remark 5.3.1 are typical of 


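Definition 5.3.4 (von Neumann Algebras) A C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$ is called a von
Neumann algebra if $\mathcal{A} = \mathcal{A}''$. In the following, we shall employ the symbol $\mathcal{M}$ to
denote von Neumann algebras, while keeping $\mathcal{A}$ for C* algebras which are not von
Neumann algebras. Von Neumann algebras $\mathcal{M}$ with center (see Example 5.3.1.2)
$\mathcal{Z} = \mathcal{M} \cap \mathcal{M}' = \{\lambda \mathbb{1}\}$ consisting of multiples of the identity are called factors.

Theorem 5.3.1 (von Neumann Bicommutant Theorem) A C* algebra $\mathcal{A} \subseteq B(\mathbb{H})$
with identity $\mathbb{1}$ is a von Neumann algebra if and only if it is strongly and weakly
closed.


Proof As the bicommutant $\mathcal{A}''$ is the commutant of the commutant it is strongly and
weakly closed. Let $\overline{\mathcal{A}}^{\,w}$, $\overline{\mathcal{A}}^{\,s}$ denote the weak and strong closures of $\mathcal{A}$; the strong
topology is finer than the weak one, thus $\overline{\mathcal{A}}^{\,s} \subseteq \overline{\mathcal{A}}^{\,w} \subseteq \mathcal{A}''$ (see Remark 5.1.1.3).
Therefore, we need only show that if $\mathcal{A} = \overline{\mathcal{A}}^{\,s}$ then $\mathcal{A} = \mathcal{A}''$; in other words, we have
to prove that $\mathcal{A}$ is strongly dense in $\mathcal{A}''$, namely that in any strong neighborhood
$U(X'')$, $X'' \in \mathcal{A}''$, of the form (5.7), there is an $X \in \mathcal{A}$. In order to do so, as a first
step, note that, according to Example 5.3.1.6, given $\psi \in \mathbb{H}$, $P^{\mathcal{A}}_\psi \in \mathcal{A}'$ and
$P^{\mathcal{A}}_\psi |\psi\rangle = |\psi\rangle$ for $\mathbb{1} \in \mathcal{A}$; thus, $P^{\mathcal{A}}_\psi \mathcal{A}'' |\psi\rangle = \mathcal{A}'' P^{\mathcal{A}}_\psi |\psi\rangle = \mathcal{A}'' |\psi\rangle \subseteq \mathbb{H}^{\mathcal{A}}_\psi$; this implies that
for any $\varepsilon > 0$ and $X'' \in \mathcal{A}''$ there exists $X \in \mathcal{A}$ such that $\|(X - X'') |\psi\rangle\| \le \varepsilon$. The
second step is to extend this result to generic strong neighborhoods; for this we
use (5.15), Example 5.3.1.5 and the previous arguments with $\mathbb{1}_n \otimes \mathcal{A}$, $M_n(\mathcal{A}')$,
$(\mathbb{1}_n \otimes \mathcal{A})'' = \mathbb{1}_n \otimes \mathcal{A}''$ and $\Psi$ replacing $\mathcal{A}$, $\mathcal{A}'$, $\mathcal{A}''$, respectively $\psi$. Then, for any
$\varepsilon > 0$, $X'' \in \mathcal{A}''$ and $\Psi = \sum_{i=1}^n |i\rangle \otimes |\psi_i\rangle$, there exists $X \in \mathcal{A}$ such that
$\sum_{i=1}^n \|(X'' - X) |\psi_i\rangle\|^2 \le \varepsilon^2$.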


Examples 5.3.2 


1. The previous proof shows that by considering the strong closure of a C* algebra
$\mathcal{A} \subseteq B(\mathbb{H})$ one obtains the bicommutant $\mathcal{A}'' \subseteq B(\mathbb{H})$.

2. Since $\mathcal{M} \subseteq \mathcal{M}' \Longrightarrow \mathcal{Z} = \mathcal{M}$, Abelian von Neumann algebras can be factors
only if trivial.

3. In the Koopman-von Neumann formalism, $C(\mathcal{X})$ is a C* but not a von Neumann
algebra; actually $C(\mathcal{X})'' = L^\infty_\mu(\mathcal{X})$ for the algebra of essentially bounded
functions on $L^2_\mu(\mathcal{X})$ is strongly closed by construction.

4. If $\mathcal{M}$ is an irreducible von Neumann algebra (see Definition 5.3.3), then it is
a factor, whereas the opposite is not true in general. However, if $\mathcal{M}$ contains
a maximally Abelian von Neumann algebra $\mathcal{N} = \mathcal{N}' \subseteq \mathcal{M}$, then $\mathcal{M}' \subseteq \mathcal{N}$ and
$\mathcal{Z} = \mathcal{M}'$, whence, if $\mathcal{M}$ is a factor it is also irreducible (see Lemma 5.3.2).

5. $B(\mathbb{H})$ is a von Neumann algebra since any bounded operator can be constructed
by closing the *-algebra of finite-rank operators on $\mathbb{H}$ in either the weak or strong
topology. Their uniform closure instead yields the C* algebra $B_\infty(\mathbb{H})$ of compact
operators which is not a von Neumann algebra for $B_\infty(\mathbb{H})'' = B(\mathbb{H}) \supset B_\infty(\mathbb{H})$,
in infinite dimension.

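6. Let $\mathcal{M} \subseteq B(\mathbb{H})$ be a von Neumann algebra, $E \in \mathcal{M}$ an orthogonal projector and
consider $E \mathcal{M} E$; this is a von Neumann algebra acting on $E\mathbb{H}$ with commutant
$(E \mathcal{M} E)' = E \mathcal{M}' E\ (= E \mathcal{M}' = \mathcal{M}' E)$. Indeed, for all $X \in \mathcal{M}$ and $X' \in \mathcal{M}'$,

$$(E X E)(E X' E) = E X E X' = X' E X E = (E X' E)(E X E) ,$$

thus $E \mathcal{M} E \subseteq (E \mathcal{M}' E)'$. On the other hand, if $X = X E \in (E \mathcal{M}' E)'$ it commutes
with E and, for any $X' \in \mathcal{M}'$,

$$X X' = X E X' = X E X' E = E X' E X = X' X ,$$

whence $(E \mathcal{M}' E)' \subseteq E \mathcal{M} E$.
7. Suppose $\mathcal{M} \subseteq B(\mathbb{H})$ is a factor von Neumann algebra and consider the algebra
$\mathcal{M} \cup \mathcal{M}'$ consisting of operators of the form $\sum_j X_j X'_j$ with $X_j \in \mathcal{M}$
and $X'_j \in \mathcal{M}'$. Then, $(\mathcal{M} \cup \mathcal{M}')' = \mathcal{M}' \cap \mathcal{M}'' = \mathcal{M} \cap \mathcal{M}' = \{\lambda \mathbb{1}\}$, whence
$(\mathcal{M} \cup \mathcal{M}')'' = B(\mathbb{H})$.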


5.3.1 States and GNS Representation 


The C* algebras so far considered had concrete representations by means of bounded 
operators on given Hilbert spaces H; more in general, the notion of C* algebra in 
Definition 5.2.1 can be formulated in purely algebraic terms. What one needs is the 
abstract setting at the beginning of Sect. 5.2 and an abstract definition of states, along 
the lines developed for classical systems in Sect. 2.2.1. 


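Definition 5.3.5 A state on a C* algebra $\mathcal{A}$ is any positive, normalized linear map
$\omega : \mathcal{A} \to \mathbb{C}$; namely, $\omega(Y^\dagger Y) \ge 0$ for all $Y \in \mathcal{A}$ and $\omega(\mathbb{1}) = 1$. States are also known
as expectation functionals.

A state $\omega$ is pure on $\mathcal{A}$ if the only positive, not necessarily normalized, functionals
$\mu : \mathcal{A} \to \mathbb{C}$ such that $\mu \le \omega$ have the form $\mu = \lambda \omega$ for some $0 \le \lambda \le 1$.

States $\omega$ such that $\omega(X^\dagger X) = 0 \Longrightarrow 0 = X \in \mathcal{A}$ are called faithful.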


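From positivity it follows that states are automatically continuous functionals;
indeed, if $\omega$ is a state on $\mathcal{A}$, then

$$\big(\omega(X^\dagger Y)\big)^* = \omega(Y^\dagger X) , \qquad |\omega(X^\dagger Y)|^2 \le \omega(X^\dagger X)\, \omega(Y^\dagger Y) . \tag{5.51}$$

In order to prove this, one chooses $\lambda \in \mathbb{C}$ and considers

$$0 \le \omega\big((X + \lambda Y)^\dagger (X + \lambda Y)\big) = \omega(X^\dagger X) + \lambda\, \omega(X^\dagger Y) + \lambda^*\, \omega(Y^\dagger X) + |\lambda|^2\, \omega(Y^\dagger Y) .$$

Then, the equality comes from setting $\lambda = 1, i$, while the inequality comes from choosing
the phase of $\lambda$ equal to the conjugate of the phase of $\omega(X^\dagger Y)$ and optimizing over $|\lambda|$.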

The sesquilinear map $\mathcal{A} \times \mathcal{A} \ni (X, Y) \mapsto \omega(X^\dagger Y)$ would be a scalar product on $\mathcal{A}$
as a linear space, were it not for the fact that, in general, $\omega(X^\dagger X) = 0$ even if $X \ne 0$.

In order to circumvent this difficulty, one considers the set
$\mathcal{I} := \{ X \in \mathcal{A} : \omega(X^\dagger X) = 0 \}$. Because of (5.51), $\mathcal{I}$ is a linear set and also closed;
therefore, one can consider the quotient $\mathcal{A}/\mathcal{I}$ consisting of the equivalence classes
$[X]_\omega := \{ X + I : I \in \mathcal{I} \}$, $X \in \mathcal{A}$. Since (5.51) gives $\omega((X + I_1)^\dagger (Y + I_2)) = \omega(X^\dagger Y)$ for all
$I_{1,2} \in \mathcal{I}$, each class can be identified with a vector $|\Psi^\omega_X\rangle$, $\mathcal{I}$ corresponding to the null
vector. It thus follows that (5.51) defines a true scalar product over the linear span
of these vectors, $\langle \Psi^\omega_X | \Psi^\omega_Y \rangle := \omega(X^\dagger Y)$. Consequently, by closing the linear span
with respect to the corresponding norm, one gets a Hilbert space $\mathbb{H}_\omega$.

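Further, it is immediate to represent operators $X \in \mathcal{A}$ as linear operators $\pi_\omega(X)$
acting multiplicatively on $\mathbb{H}_\omega$,

$$X \mapsto \pi_\omega(X) , \qquad \pi_\omega(X)\, |\Psi^\omega_Y\rangle = |\Psi^\omega_{XY}\rangle . \tag{5.52}$$

Since $\mathcal{I}$ is a left ideal in $\mathcal{A}$, the null vector is mapped into the null vector and
$\pi_\omega(X)$ is a well-defined linear operator on $\mathbb{H}_\omega$; it is also bounded, for (5.51) and
$X^\dagger X \le \|X\|^2\, \mathbb{1}$ imply

$$\|\pi_\omega(X)\, |\Psi^\omega_Y\rangle\|^2 = \omega(Y^\dagger X^\dagger X Y) \le \|X\|^2\, \omega(Y^\dagger Y) = \|X\|^2\, \| |\Psi^\omega_Y\rangle \|^2 .$$

Further, $\pi_\omega$ is a so-called *-morphism, that is

$$\pi_\omega(X^\dagger) = \pi_\omega(X)^\dagger , \qquad \pi_\omega(XY) = \pi_\omega(X)\, \pi_\omega(Y) \qquad \forall\, X, Y \in \mathcal{A} .$$

Therefore, $\pi_\omega$ represents $\mathcal{A}$ as a subalgebra of the bounded operators on $\mathbb{H}_\omega$:
$\mathcal{A} \mapsto \pi_\omega(\mathcal{A}) \subseteq B(\mathbb{H}_\omega)$.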


Definition 5.3.6 (Representations) A *-homomorphism (homomorphism for short)
between two C* algebras $\mathcal{A}_{1,2}$ is a linear map $\pi : \mathcal{A}_1 \to \mathcal{A}_2$ that preserves the
algebraic relations and the adjoint operation:

$$\pi(A_1^\dagger) = \pi(A_1)^\dagger , \qquad \pi(A_1 B_1) = \pi(A_1)\, \pi(B_1) \qquad \forall\, A_1, B_1 \in \mathcal{A}_1 .$$

When a homomorphism is invertible, it is called an isomorphism, automorphism if
it maps $\mathcal{A}$ invertibly onto itself.

If $\mathcal{A}_1 = \mathcal{A}$ and $\mathcal{A}_2 = B(\mathbb{H})$, then $\pi$ gives a representation $(\pi(\mathcal{A}), \mathbb{H})$ as a C*
(sub)algebra of bounded operators on a Hilbert space $\mathbb{H}$. Two representations
$(\pi_{1,2}(\mathcal{A}), \mathbb{H}_{1,2})$ of $\mathcal{A}$ on two Hilbert spaces $\mathbb{H}_{1,2}$ are equivalent if there exists an
isometry $U : \mathbb{H}_1 \to \mathbb{H}_2$ such that $\pi_1(A) = U^\dagger \pi_2(A)\, U$.


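According to Definition 5.3.2, the vector $|\Psi^\omega_{\mathbb{1}}\rangle$ is cyclic for $\pi_\omega(\mathcal{A})$ on $\mathbb{H}_\omega$; in fact,
$\pi_\omega(X)\, |\Psi^\omega_{\mathbb{1}}\rangle = |\Psi^\omega_X\rangle$ and the linear span of the vectors of the form $|\Psi^\omega_X\rangle$, $X \in \mathcal{A}$,
is dense in $\mathbb{H}_\omega$, by construction. Also, the expectation associated with $\omega$ takes the
form

$$\omega(X) = \langle \Psi^\omega_{\mathbb{1}} | \pi_\omega(X) | \Psi^\omega_{\mathbb{1}} \rangle , \qquad X \in \mathcal{A} . \tag{5.53}$$

We shall set $|\Omega_\omega\rangle := |\Psi^\omega_{\mathbb{1}}\rangle$; from Definition 5.3.2 it also follows that $|\Omega_\omega\rangle$ is
separating for the commutant $\pi_\omega(\mathcal{A})' \subseteq B(\mathbb{H}_\omega)$.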

The previous approach is due to Gelfand, Naimark and Segal and is known as
GNS construction [80].

Definition 5.3.7 Given the GNS triplet $(\mathbb{H}_\omega, \pi_\omega, \Omega_\omega)$, $\mathbb{H}_\omega$, $\pi_\omega$ and $\Omega_\omega$ will be called
GNS Hilbert space, GNS representation and GNS vector, respectively.
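
For a state $\omega(X) = \mathrm{Tr}(\rho X)$ on $M_n(\mathbb{C})$ with $\rho = \mathrm{diag}(p_1, \ldots, p_n)$, a concrete realization of the GNS triplet is given by $\pi(X) = X \otimes \mathbb{1}$ acting on the vector $\Omega = \sum_i \sqrt{p_i}\, |i\rangle \otimes |i\rangle$, restricted to the subspace generated by $\pi(M_n(\mathbb{C}))\,\Omega$. This identification is a standard finite-dimensional example that is assumed here rather than taken from the text; the sketch below checks (5.53), the scalar product $\omega(X^\dagger Y)$ and the dimension of the resulting GNS space, with hypothetical function names.

```python
import numpy as np

def gns_vector(p):
    """Omega = sum_i sqrt(p_i) |i> (x) |i> for omega(X) = Tr(rho X), rho = diag(p)."""
    return np.diag(np.sqrt(p)).flatten()

def gns_rep(X):
    """pi(X) = X (x) 1 on C^n (x) C^n."""
    return np.kron(X, np.eye(X.shape[0]))

def gns_dimension(p):
    """Dimension of the span of pi(E_ij) Omega, i.e. of the GNS Hilbert space."""
    n = len(p)
    units = np.eye(n * n).reshape(n, n, n, n)          # units[i, j] is E_ij
    omega_vec = gns_vector(p)
    orbit = np.stack([gns_rep(units[i, j]) @ omega_vec
                      for i in range(n) for j in range(n)], axis=1)
    return int(np.linalg.matrix_rank(orbit))

rng = np.random.default_rng(6)
p = np.array([0.5, 0.3, 0.2])
rho, Omega = np.diag(p), gns_vector(p)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Y = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

expectation = np.vdot(Omega, gns_rep(X) @ Omega)                 # <Omega|pi(X)|Omega>
scalar_product = np.vdot(gns_rep(X) @ Omega, gns_rep(Y) @ Omega)  # <Psi_X|Psi_Y>
```

With a faithful state the vectors $\pi(X)\Omega$ fill the whole of $\mathbb{C}^n \otimes \mathbb{C}^n$; when some $p_i$ vanish the elements with $\omega(X^\dagger X) = 0$ are quotiented out and the dimension drops accordingly.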


Remarks 5.3.2 


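1. Any triplet $(\mathbb{H}, \pi, \Omega)$ with the GNS properties of $(\mathbb{H}_\omega, \pi_\omega, \Omega_\omega)$ is unitarily
equivalent to it. Namely, there exists an isometry $U : \mathbb{H} \to \mathbb{H}_\omega$ such that
$U |\Omega\rangle = |\Omega_\omega\rangle$ and $\pi(X) = U^\dagger \pi_\omega(X)\, U$ for all $X \in \mathcal{A}$. Indeed, because of
(5.53) that holds for both representations, the map $U : \mathbb{H} \to \mathbb{H}_\omega$ defined by
$U \pi(X) |\Omega\rangle = \pi_\omega(X) |\Omega_\omega\rangle$ is such that

$$\omega(X^\dagger Y) = \langle \Omega_\omega | \pi_\omega(X^\dagger Y) | \Omega_\omega \rangle = \langle \Omega | \pi(X)^\dagger\, U^\dagger U\, \pi(Y) | \Omega \rangle = \langle \Omega | \pi(X)^\dagger \pi(Y) | \Omega \rangle$$

on the dense subsets $\pi_\omega(\mathcal{A}) |\Omega_\omega\rangle \subseteq \mathbb{H}_\omega$ and $\pi(\mathcal{A}) |\Omega\rangle \subseteq \mathbb{H}$. Then, U extends
to an isometry $U : \mathbb{H} \to \mathbb{H}_\omega$; furthermore, on the dense subset of vectors $\pi(Y) |\Omega\rangle$,
$Y \in \mathcal{A}$,

$$U^\dagger \pi_\omega(X)\, U\, \pi(Y) |\Omega\rangle = U^\dagger \pi_\omega(X)\, \pi_\omega(Y) |\Omega_\omega\rangle = U^\dagger \pi_\omega(XY) |\Omega_\omega\rangle = U^\dagger U\, \pi(XY) |\Omega\rangle = \pi(X)\, \pi(Y) |\Omega\rangle ,$$

that is $U^\dagger \pi_\omega(X)\, U = \pi(X)$ for all $X \in \mathcal{A}$.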

2. As a *-homomorphism, the GNS representation $\pi_\omega$ preserves the C* properties of
$\mathcal{A}$. Therefore $\pi_\omega(\mathcal{A})$ is a C* algebra, as well as its commutant $\pi_\omega(\mathcal{A})'$. The latter
is also a von Neumann algebra; this need not be true of $\pi_\omega(\mathcal{A})$, but it is certainly
so of the bicommutant $\pi_\omega(\mathcal{A})''$, that is of the strong closure of $\pi_\omega(\mathcal{A})$ on $\mathbb{H}_\omega$. If
the center $\mathcal{Z}_\omega := \pi_\omega(\mathcal{A})'' \cap \pi_\omega(\mathcal{A})'$ is trivial, that is it consists of the multiples
of the identity only, then $\omega$ is called a factor or primary state.

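3. If $\omega$ is a state on $\mathcal{A}$ and $\nu$ is a linear positive functional on it, majorized by $\omega$, $\nu \le \omega$,
then also $\nu$ satisfies a Cauchy-Schwarz inequality as $\omega$ in (5.51) (see footnote 4):
$|\nu(X^\dagger Y)|^2 \le \nu(X^\dagger X)\, \nu(Y^\dagger Y) \le \omega(X^\dagger X)\, \omega(Y^\dagger Y)$. Consequently, $\nu$ defines a continuous
sesquilinear form on $\mathbb{H}_\omega \times \mathbb{H}_\omega$ so that, from Example 5.2.4,
$\nu(X^\dagger Y) = \langle \Omega_\omega | \pi_\omega(X)^\dagger\, T'\, \pi_\omega(Y) | \Omega_\omega \rangle$, with $T' \in B(\mathbb{H}_\omega)$. Further, from
$0 \le \nu(X^\dagger X) \le \omega(X^\dagger X)$, for all $X \in \mathcal{A}$, one deduces that $0 \le T' \le \mathbb{1}$. Moreover,
$T' \in \pi_\omega(\mathcal{A})'$; indeed,

$$\nu(X^\dagger Y Z) = \langle \Omega_\omega | \pi_\omega(X)^\dagger\, T'\, \pi_\omega(Y)\, \pi_\omega(Z) | \Omega_\omega \rangle = \nu\big((Y^\dagger X)^\dagger Z\big) = \langle \Omega_\omega | \pi_\omega(X)^\dagger\, \pi_\omega(Y)\, T'\, \pi_\omega(Z) | \Omega_\omega \rangle ,$$

whence $[T', \pi_\omega(Y)] = 0$ for all $Y \in \mathcal{A}$ since $\pi_\omega(\mathcal{A}) |\Omega_\omega\rangle$ is dense in $\mathbb{H}_\omega$.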


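Footnote 4: What matters in the derivation of (5.51) is positivity and not normalization.

4. From the previous result, it turns out that $\pi_\omega(\mathcal{A})$ is an irreducible C* algebra (see
Definition 5.3.3) if and only if $\omega$ is a pure state (see Definition 5.3.5). In fact,
according to Lemma 5.3.2, $\pi_\omega(\mathcal{A})$ is irreducible if and only if $\pi_\omega(\mathcal{A})' = \{\lambda \mathbb{1}\}$.
If $\pi_\omega(\mathcal{A})'$ is trivial then $\nu \le \omega$ implies $T' = \lambda \mathbb{1}$, hence $\omega$ is pure. On the other
hand, if $\pi_\omega(\mathcal{A})'$ is not trivial, then there exists some $\mathbb{1} \ne X' \in \pi_\omega(\mathcal{A})'$, so that also
$X' + (X')^\dagger$ and its spectral projectors belong to the von Neumann algebra $\pi_\omega(\mathcal{A})'$.
Therefore, there must be at least one non-trivial projector $P' \in \pi_\omega(\mathcal{A})'$; also,
$\mathbb{1} - P' =: Q' \in \pi_\omega(\mathcal{A})'$, so that one can decompose $\omega$ into a convex combination
$\omega = \lambda\, \omega_{P'} + (1 - \lambda)\, \omega_{Q'}$, where $\lambda := \langle \Omega_\omega | P' | \Omega_\omega \rangle$ while

$$X \mapsto \widetilde{\omega}_{P'}(X) := \lambda\, \omega_{P'}(X) := \langle \Omega_\omega | P' \pi_\omega(X) | \Omega_\omega \rangle \tag{5.54}$$
$$X \mapsto \widetilde{\omega}_{Q'}(X) := (1 - \lambda)\, \omega_{Q'}(X) := \langle \Omega_\omega | Q' \pi_\omega(X) | \Omega_\omega \rangle \tag{5.55}$$

are positive linear functionals over $\mathcal{A}$ (with $\omega_{P'}$ and $\omega_{Q'}$ normalized) which are both
majorized by $\omega$ but are not of the form $\lambda \omega$ (compare the analogous argument in the proof of
Proposition 2.3.8).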

5. The previous point is an example of the convex structure [21] of the space of states $\mathcal{S}(\mathcal{A})$ on a C* algebra $\mathcal{A}$. In more formal terms [80]: given a C* algebra $\mathcal{A}$ with identity, the set $\mathcal{S}(\mathcal{A})$ of its states is compact in the w* topology generated by the semi-norms $\mathcal{S}(\mathcal{A}) \ni \omega \mapsto L_X(\omega) = |\omega(X)|$. Moreover, its extremal points are the pure states and $\mathcal{S}(\mathcal{A})$ is the w* closure of their convex hull.


5.3.2 C* and von Neumann Abelian Algebras 


Let $\mathcal{A}$ be an Abelian unital C* or von Neumann algebra. As discussed in Remarks 2.2.2.2,3, the algebra $C(\mathcal{X})$ of the continuous functions over a compact topological space $\mathcal{X}$ is a typical example of the first case, while the algebra $L^\infty_\mu(\mathcal{X})$ of the essentially bounded functions is an instance of the second case. In this section we shall show that these two cases do in fact exhaust all the possibilities: the main technique we shall use is the so-called Gelfand transform.

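All multiplicative functionals $\chi : \mathcal{A} \to \mathbb{C}$ such that

$$\chi(AB) = \chi(A)\chi(B), \qquad \chi(A^\dagger) = \chi(A)^* \qquad \forall A, B \in \mathcal{A}$$

are known as characters and their set will be denoted by $\mathcal{X}_\mathcal{A}$.
It turns out that characters are states on $\mathcal{A}$; indeed,

$$\chi(A) = \chi(\mathbb{1} A) = \chi(\mathbb{1})\,\chi(A) \implies \chi(\mathbb{1}) = 1,$$

while, if $\mathcal{A} \ni A \ge 0$ then $A = B^\dagger B$ (see Remark 5.2.2), whence

$$\chi(A) = \chi(B^\dagger B) = |\chi(B)|^2 \ge 0.$$

Further, $A - a$, with $a \in \mathbb{C}$ and $A \in \mathcal{A}$, is invertible if there exists $B \in \mathcal{A}$ such that $B(A - a) = \mathbb{1}$; since, for any $\chi \in \mathcal{X}_\mathcal{A}$, $\chi(B)(\chi(A) - a) = 1$, it follows that if $A - a$ is invertible then $\chi(A) \neq a$ for all $\chi \in \mathcal{X}_\mathcal{A}$. Therefore, the spectrum of $A \in \mathcal{A}$ contains the values assumed on $A$ by the characters on $\mathcal{A}$:

$$\mathrm{Sp}(A) \supseteq \{\chi(A) : \chi \in \mathcal{X}_\mathcal{A}\}. \tag{5.56}$$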
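Examples 5.3.3

1. Let $\mathcal{A} = C(\mathcal{X})$, then $\mathcal{X}_\mathcal{A}$ consists of the evaluation functionals (Dirac deltas) $\delta_x(f) = f(x)$ for all $x \in \mathcal{X}$, $f \in C(\mathcal{X})$.

2. Let $\mathcal{A} = D_N(\mathbb{C})$ be the algebra of all diagonal matrices on $\mathbb{C}^N$ with respect to a given ONB $\{|i\rangle\}_{i=1}^N$, namely $\mathcal{A} \ni A = \sum_{i=1}^N A_i E_{ii}$, where $\{E_{ij}\}$ is the associated family of matrix units. Then, $\mathcal{X}_\mathcal{A}$ consists of the maps

$$\chi_j(A) = \mathrm{Tr}(A E_{jj}) = \langle j | A | j \rangle = A_j.$$

Indeed, $AB = \sum_{i,j=1}^N A_i B_j E_{ii} E_{jj} = \sum_{i=1}^N A_i B_i E_{ii}$ implies $\chi_j(AB) = A_j B_j = \chi_j(A)\chi_j(B)$ for all $A, B \in \mathcal{A}$. Notice that the multiplicative property does not hold for every pure state $|\psi\rangle\langle\psi| \in M_N(\mathbb{C})$; in fact, for $|\psi\rangle = \alpha|i\rangle + \beta|j\rangle$, $i \neq j$,

$$\langle\psi| AB |\psi\rangle = |\alpha|^2 A_i B_i + |\beta|^2 A_j B_j \quad \text{while}$$

$$\langle\psi| A |\psi\rangle \langle\psi| B |\psi\rangle = |\alpha|^4 A_i B_i + |\beta|^4 A_j B_j + |\alpha|^2 |\beta|^2 (A_i B_j + A_j B_i).$$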


3. Characters behave as tracial states over $\mathcal{A}$, namely $\chi(AB) = \chi(BA)$. However, the only tracial state on $M_N(\mathbb{C})$ is given by

$$\tau(X) = \frac{1}{N}\mathrm{Tr}(X), \quad \text{so that} \quad \tau(XY) = \tau(YX), \tag{5.57}$$

for all $X, Y \in M_N(\mathbb{C})$. It thus follows that there cannot be characters on $M_N(\mathbb{C})$. Indeed,

$$\tau(E_{ii} E_{ii}) = \tau(E_{ii}) = \frac{1}{N}, \qquad \tau(E_{ii})\,\tau(E_{ii}) = \frac{1}{N^2},$$

therefore the tracial state can be multiplicative only if $N = 1$. In order to show that the only tracial state on $M_N(\mathbb{C})$ is $\tau$, let us assume that there exists another state $\omega$ such that $\omega(XY) = \omega(YX)$ for all $X, Y \in M_N(\mathbb{C})$. Let $X = E_{ij}$ and $Y = E_{k\ell}$; because of (5.12), it turns out that

$$\omega(E_{ij} E_{k\ell}) = \delta_{jk}\,\omega(E_{i\ell}) = \omega(E_{k\ell} E_{ij}) = \delta_{\ell i}\,\omega(E_{kj}).$$

Thus, $\omega(E_{ij}) = 0$ if $i \neq j$ and $\omega(E_{ii}) = a$ for all $i = 1, 2, \ldots, N$ so that $\omega(\mathbb{1}) = 1 \implies a = 1/N$. In conclusion, $\omega$ acts as $\tau$ on a system of matrix units and must thus coincide with it.
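The statements in Examples 5.3.3 can be checked on small matrices. The following is a minimal sketch, not part of the original text: diagonal matrices are stored as lists of their diagonal entries, and the helper names (`character`, `superposition_state`, `normalized_trace`, `matrix_unit`) as well as the sample values are illustrative assumptions.

```python
from fractions import Fraction

def character(j):
    """chi_j: picks the j-th diagonal entry of a diagonal matrix (stored as a list)."""
    return lambda diag: diag[j]

def superposition_state(weight, i, j):
    """Expectation in alpha|i> + beta|j>, with weight = |alpha|^2 and |beta|^2 = 1 - weight."""
    return lambda diag: weight * diag[i] + (1 - weight) * diag[j]

def diag_product(a, b):
    """Product of two diagonal matrices given by their diagonals."""
    return [x * y for x, y in zip(a, b)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def normalized_trace(x):
    """tau(X) = Tr(X) / N for a matrix with integer entries."""
    n = len(x)
    return Fraction(sum(x[k][k] for k in range(n)), n)

def matrix_unit(n, i, j):
    """E_ij: 1 in row i, column j, zero elsewhere (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

A = [2, 3, 5]            # sample diagonal matrices (arbitrary values)
B = [7, 11, 13]
chi = character(1)
omega = superposition_state(Fraction(1, 2), 0, 1)
E = matrix_unit(3, 0, 0)

# characters are multiplicative, a superposition state is not
print(chi(diag_product(A, B)) == chi(A) * chi(B))
print(omega(diag_product(A, B)) == omega(A) * omega(B))
# tau(E_ii E_ii) versus tau(E_ii)^2
print(normalized_trace(matmul(E, E)), normalized_trace(E) ** 2)
```

Exact rational arithmetic is used so that the comparison between $1/N$ and $1/N^2$ is not blurred by rounding.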
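The set of characters is a subset of the unit ball of the topological dual of $\mathcal{A}$ ($\mathcal{X}_\mathcal{A} \subset (\mathcal{A}^*)_1$); moreover, $\mathcal{X}_\mathcal{A}$ coincides with the set of pure states over $\mathcal{A}$. In fact, if $\omega$ is a pure state on $\mathcal{A}$, then in the GNS representation $\pi_\omega(\mathcal{A})$ is irreducible ($\pi_\omega(\mathcal{A})' = \{\lambda \mathbb{1}\}$), but then $\pi_\omega(A) = \lambda_A \mathbb{1}$ for all $A \in \mathcal{A}$, for Abelianness implies $\pi_\omega(\mathcal{A}) \subseteq \pi_\omega(\mathcal{A})'$. Then, $\omega(A) = \langle \Omega_\omega | \pi_\omega(A) | \Omega_\omega \rangle = \lambda_A$ and

$$\omega(AB) = \langle \Omega_\omega | \pi_\omega(A)\pi_\omega(B) | \Omega_\omega \rangle = \lambda_A \lambda_B = \omega(A)\,\omega(B),$$

whence $\omega \in \mathcal{X}_\mathcal{A}$.

Let $\mathcal{A}^*$ be endowed with the w*-topology (see Remark 5.1.1.5), then $\mathcal{X}_\mathcal{A}$ is a w*-closed subset of $(\mathcal{A}^*)_1$. In fact, if $\chi_n \in \mathcal{X}_\mathcal{A}$ w*-converges to $\chi$, that is if $\chi_n(A) \to \chi(A)$ for all $A \in \mathcal{A}$, $\chi$ is linear and also multiplicative; indeed,

$$|\chi(AB) - \chi(A)\chi(B)| \le |\chi(AB) - \chi_n(AB)| + \|A\|\, |\chi_n(B) - \chi(B)| + \|B\|\, |\chi_n(A) - \chi(A)|.$$

Therefore, by choosing $n$ large enough the right hand side of the inequality can be made arbitrarily small, so that the left hand side vanishes.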


Remark 5.3.3 Once the topological dual $\mathcal{A}^*$ is equipped with the w*-topology, its unit ball $(\mathcal{A}^*)_1$ is compact by the Banach-Alaoglu theorem [385]. As $\mathcal{X}_\mathcal{A}$ is a closed subset, it is also compact [385]; further, since the space of states is Hausdorff, so is $\mathcal{X}_\mathcal{A}$. One can thus consider the C* algebra $C(\mathcal{X}_\mathcal{A})$ of continuous functions over the set of characters and the corresponding properties. For instance, in the proof of Theorem 5.3.2, we shall profit from a theorem of Stone and Weierstrass [306] which states that the norm-closure of any algebra of complex functions on a compact Hausdorff space $\mathcal{X}$ that separates points and contains the identity coincides with $C(\mathcal{X})$.


Definition 5.3.8 (Gelfand transform) The Gelfand transform is the map $\Gamma : \mathcal{A} \to C(\mathcal{X}_\mathcal{A})$ from an Abelian C* algebra to the continuous functions over its characters, defined by

$$\mathcal{A} \ni A \mapsto \Gamma[A](\chi) = \chi(A) \qquad \forall \chi \in \mathcal{X}_\mathcal{A}.$$

Notice that $\Gamma[A](\chi)$ is automatically continuous on $\mathcal{X}_\mathcal{A}$ equipped with the w* topology inherited from $\mathcal{A}^*$. In full generality, the following property holds [80,353,385].

Theorem 5.3.2 Any Abelian unital C* algebra $\mathcal{A}$ is isomorphic to $C(\mathcal{X}_\mathcal{A})$.

Proof The Gelfand transform is a *-homomorphism: linearity is evident, also $\Gamma[A^\dagger](\chi) = \chi(A^\dagger) = \chi(A)^* = (\Gamma[A](\chi))^*$. Moreover,

$$\Gamma[AB](\chi) = \chi(AB) = \chi(A)\chi(B) = (\Gamma[A]\,\Gamma[B])(\chi).$$
It also preserves the norm; in fact,

$$\|\Gamma[A]\|^2 = \sup_{\chi \in \mathcal{X}_\mathcal{A}} |\Gamma[A](\chi)|^2 = \sup_{\chi \in \mathcal{X}_\mathcal{A}} |\chi(A)|^2 = \|A\|^2,$$

for all $A \in \mathcal{A}$. The latter equality is a consequence of the fact that $\mathcal{X}_\mathcal{A}$ coincides with the set of pure states over $\mathcal{A}$ and that [80,353], for any $A \in \mathcal{A}$ one can always construct a pure state $\omega$ such that $|\omega(A)| = \|A\|$.

It thus follows that $\Gamma[A] = 0$ only if $A = 0$. Furthermore, $\Gamma[\mathbb{1}] = 1$ and, if $\chi_1 \neq \chi_2$, then $\chi_1(A) = \Gamma[A](\chi_1) \neq \chi_2(A) = \Gamma[A](\chi_2)$ for some $A \in \mathcal{A}$. One says that $\Gamma[\mathcal{A}]$ separates points of $\mathcal{X}_\mathcal{A}$; thus, the theorem of Stone and Weierstrass (see Remark 5.3.3) applies to $\Gamma[\mathcal{A}]$ so that $\Gamma[\mathcal{A}] = C(\mathcal{X}_\mathcal{A})$.


Remark 5.3.4 ([385]) If $\mathcal{A}$ is a generic C* algebra and $X$ one of its normal elements ($X X^\dagger = X^\dagger X$), then one can consider the Abelian C* algebra $\mathcal{A}[X]$ generated by the norm closure of the *-algebra of polynomials in the commuting operators $\mathbb{1}$, $X$ and $X^\dagger$. Let us consider the Gelfand transform $\Gamma : \mathcal{A}[X] \to C(\mathcal{X}_{\mathcal{A}[X]})$; to any function $f \in C(\mathcal{X}_{\mathcal{A}[X]})$ one associates a unique element $f(X) := \Gamma^{-1}[f] \in \mathcal{A}[X]$. This is known as continuous functional calculus. Consider, for instance, the function $f(\chi) = (\chi(X) - z)^{-1}$; then, by using a power series expansion and the isomorphic properties of $\Gamma$ and its inverse, one obtains

$$\Gamma\Big[\frac{1}{X - z}\Big] = f, \qquad f(X) = \Gamma^{-1}[f] = \frac{1}{X - z},$$

whenever $|z| > \|X\| = \sup_{\chi \in \mathcal{X}_{\mathcal{A}[X]}} |\chi(X)|$. Thus, from Example 5.2.2.5 it follows that the spectrum of a normal $X \in \mathcal{A}$ coincides with the values assumed on $X$ by the characters of $\mathcal{X}_{\mathcal{A}[X]}$: $\mathrm{Sp}(X) = \{\chi(X) : \chi \in \mathcal{X}_{\mathcal{A}[X]}\}$.


Examples 5.3.4 


1. If $\mathcal{A}$ is a finite-dimensional Abelian algebra, then $\mathcal{X}_\mathcal{A}$ contains finitely many points (characters) $\mathcal{X}_\mathcal{A} = \{\chi_i\}_{i=1}^n$. The maps $\delta_i : \mathcal{X}_\mathcal{A} \to \mathbb{C}$ defined by $\delta_i(\chi_j) = \delta_{ij}$, are continuous with respect to the discrete topology on $\mathcal{X}_\mathcal{A}$. By inverting the Gelfand isomorphism, the corresponding elements $p_i := \Gamma^{-1}[\delta_i] \in \mathcal{A}$ are orthogonal projections:

$$p_i\, p_k = \Gamma^{-1}[\delta_i \delta_k] = \delta_{ik}\, \Gamma^{-1}[\delta_k] = \delta_{ik}\, p_k.$$

These are minimal projections of $\mathcal{A}$; namely, the only projections in $\mathcal{A}$ majorized by $p_i$ are the trivial one $p = 0$ and $p_i$ itself. Indeed, consider two projections $q$, $p$ in a generic unital C* algebra and suppose $q \le p$; then, writing $p = q + (p - q)$, Example 5.2.3.2 yields

$$q \ge q\,p\,q = q + q(p - q)q \ge q \quad \text{for} \quad p - q \ge 0.$$

Therefore, $q(p - q)q = X^\dagger X = 0$ with $X = \sqrt{p - q}\; q$ whence $X = 0$ and $pq = q = qp$. If $p = p_i \in \mathcal{A}$ and $\mathcal{A} \ni q \le p_i$, write $\Gamma[q] = \sum_{j=1}^n \pi_j(q)\, \delta_j$ with $\pi_j(q) \in \mathbb{R}_+$; it follows that

$$\Gamma[q] = \Gamma[q\, p_i] = \Gamma[q]\, \Gamma[p_i] = \pi_i(q)\, \delta_i.$$

Since $\Gamma[q] = \Gamma[q]^2$, one concludes that, for $q \neq 0$, $\Gamma[q] = \delta_i$ whence $q = p_i$.

2. Consider $X \ge 0$, and the function $f(t) = \sqrt{t}$, $t \ge 0$. Since the spectrum of $X$ is contained in $[0, \|X\|]$ and thus also $\mathcal{X}_{\mathcal{A}[X]} \subseteq [0, \|X\|]$, from Remark 5.3.4 it follows that $Y := f(X) = \Gamma^{-1}[\sqrt{t}] \ge 0$ and $Y^2 = \Gamma^{-1}[t] = X$. We now show that the square-root of $X$ is unique; let $\mathcal{A} \ni Z \ge 0$ be such that $Z^2 = X$ and let $P_n(t)$ be a sequence of polynomials on $[0, \|X\|]$ converging uniformly to $\sqrt{t}$. Set $Q_n(t) := P_n(t^2)$: $\lim_{n\to+\infty} Q_n(t) = t$ uniformly on $[0, \sqrt{\|X\|}]$. Furthermore, from Remark 5.3.4 and Remark 5.2.1,

$$\mathcal{X}_{\mathcal{A}[X]} = \mathrm{Sp}(X) = \mathrm{Sp}(Z^2) = (\mathrm{Sp}(Z))^2 = \{t^2 : t \in \mathrm{Sp}(Z)\},$$

whence

$$Z = \lim_{n\to+\infty} Q_n(Z) = \lim_{n\to+\infty} P_n(Z^2) = \lim_{n\to+\infty} P_n(X) = \sqrt{X}.$$

3. The Gelfand isomorphism maps the von Neumann algebra $L^\infty_\mu(\mathcal{X})$ into $C(\mathcal{Y})$ where $\mathcal{Y}$ is a so-called extremally disconnected Hausdorff space, that is one whose open sets have open closures [385].


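The preceding considerations can be extended to Abelian von Neumann algebras $\mathcal{M} \subseteq \mathbb{B}(\mathbb{H})$ on a separable Hilbert space $\mathbb{H}$ [385]. Since $\mathcal{M}$ is also a C* algebra, one considers the Gelfand isomorphism $\Gamma : \mathcal{M} \to C(\mathcal{X}_\mathcal{M})$; suppose $\psi \in \mathbb{H}$ to be cyclic for $\mathcal{M}$ and define the linear functional $F : C(\mathcal{X}_\mathcal{M}) \to \mathbb{C}$,

$$F(f) := \langle \psi | \Gamma^{-1}[f] | \psi \rangle.$$

This functional is positive since $\Gamma^{-1}$ is an isomorphism and preserves positivity; then, the Riesz representation theorem [305] (see (2.48)) ensures that there exists a positive Borel measure $\mu$ on $\mathcal{X}_\mathcal{M}$ such that

$$\langle \psi | \Gamma^{-1}[f] | \psi \rangle = \int_{\mathcal{X}_\mathcal{M}} \mathrm{d}\mu(\chi)\, f(\chi).$$

The support of $\mu$ is the whole of $\mathcal{X}_\mathcal{M}$, otherwise there would exist $\mathcal{Y} \subset \mathcal{X}_\mathcal{M}$ and a positive continuous $f$, non-zero on $\mathcal{Y}$, such that

$$\mu(f) = 0 = \langle \psi | \Gamma^{-1}[f] | \psi \rangle.$$

Since $\psi$ is cyclic for $\mathcal{M}$, it is separating for $\mathcal{M}'$ and for $\mathcal{M} \subseteq \mathcal{M}'$; then $\langle \psi | \Gamma^{-1}[f] | \psi \rangle = 0$ implies $\Gamma^{-1}[f] = 0$ whence $f = 0$. For all $X \in \mathcal{M}$,

$$\int_{\mathcal{X}_\mathcal{M}} \mathrm{d}\mu(\chi)\, |\Gamma[X](\chi)|^2 = \langle \psi | \Gamma^{-1}\big[|\Gamma[X]|^2\big] | \psi \rangle = \| X |\psi\rangle \|^2.$$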


Then, one can construct a unitary operator $U : \mathbb{H} \to \mathbb{K} := L^2_\mu(\mathcal{X}_\mathcal{M})$ by extending to the $L^2$-closures of $\mathcal{M}|\psi\rangle$ and $C(\mathcal{X}_\mathcal{M})$ the linear operator defined on the latter spaces by $U : X|\psi\rangle \mapsto \Gamma[X]$. It turns out that

$$U X U^\dagger\, \Gamma[Y] = U X Y |\psi\rangle = \Gamma[XY] = \Gamma[X]\, \Gamma[Y],$$

for all $X \in \mathcal{M}$, whence $U X U^\dagger$ is represented by a multiplication operator on $C(\mathcal{X}_\mathcal{M})$. This relation can be used to prove that $U \mathcal{M} U^\dagger$ is a von Neumann subalgebra of $\mathbb{B}(\mathbb{K})$ and, since the algebra of multiplication by continuous functions is weakly dense in the von Neumann algebra of multiplication operators by functions in $L^\infty_\mu(\mathcal{X}_\mathcal{M})$, it follows that this latter is isomorphic to $U \mathcal{M} U^\dagger$.

Even when a cyclic vector for the von Neumann algebra M does not exist, one 
can prove a similar result [385]. 


Theorem 5.3.3 Every Abelian von Neumann algebra $\mathcal{M}$ acting on a separable Hilbert space $\mathbb{H}$ is isomorphic to some $L^\infty_\mu(\mathcal{X})$, where $\mathcal{X}$ is a compact Hausdorff space and $\mu$ is a finite, positive Borel measure on $\mathcal{X}$ supported by the whole of $\mathcal{X}$.
The simplest quantum systems are 2-level systems (the qubits of quantum information): their states and observables are 2 x 2 matrices from $M_2(\mathbb{C})$ acting on the Hilbert space $\mathbb{C}^2$. Though simple, the 2-dimensional framework is sufficient to accommodate a variety of rather successful phenomenological descriptions, as for spin 1/2 particles in magnetic contexts [293,323,331], for atoms whose ground and first excited states can be treated as isolated from the rest of the energy eigenvalues [107], for the polarization degree of freedom of photons [322,369] and for the strangeness degree of freedom of neutral K mesons [40,41]. Recently, even macroscopic systems, in particular ultracold atoms [227], have begun to be studied as spin 1/2 particles; this is the case for the low-lying energy states of Bose-Einstein condensates in double well potentials and superconducting boxes near resonance [237,374]. The latest advances in the experimental manipulation of atomic systems have indeed provided concrete realizations of 2-level quantum systems and made them available for the actual verification of central issues of quantum information theory [7,79].

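The observables of 2-level systems are self-adjoint 2 x 2 matrices acting on the 2-dimensional Hilbert space $\mathbb{C}^2$; particularly important are the unitary and self-adjoint Pauli matrices $\sigma_{1,2,3}$ that satisfy the algebraic relations

$$\sigma_j \sigma_k = \delta_{jk}\, \mathbb{1}_2 + i\, \varepsilon_{jk\ell}\, \sigma_\ell, \tag{5.58}$$

where $\varepsilon_{jk\ell}$ is the antisymmetric 3-tensor (summed over the repeated index $\ell$), and $\mathbb{1}_2$ denotes the 2 x 2 identity matrix.

When normalized, $\hat\sigma_\mu := \sigma_\mu/\sqrt{2}$ with $\sigma_0 := \mathbb{1}_2$, they become an ONB with respect to the Hilbert-Schmidt scalar product (5.28), that is $\mathrm{Tr}(\hat\sigma_\mu \hat\sigma_\nu) = \delta_{\mu\nu}$. Thus, it turns out that any $X \in M_2(\mathbb{C})$ can be written as (see Example 5.2.5)

$$X = \sum_{\mu=0}^{3} \mathrm{Tr}(\hat\sigma_\mu X)\, \hat\sigma_\mu. \tag{5.59}$$

It is customary to work within the representation of the eigenvectors of $\sigma_3$, $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (the states of a particle with spin 1/2 pointing up, respectively down along the $z$ direction in space). Then, the Pauli matrices have the standard form

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

so that $\sigma_1|0\rangle = |1\rangle$, $\sigma_1|1\rangle = |0\rangle$ and $\sigma_2|0\rangle = i|1\rangle$, $\sigma_2|1\rangle = -i|0\rangle$.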
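The product rule (5.58) and the expansion (5.59) lend themselves to a direct numerical check. The sketch below is an illustration rather than part of the original text; it assumes the standard matrix form of the Pauli matrices, stores 2 x 2 matrices as nested lists, and uses hypothetical helper names (`product_rule_rhs`, `pauli_expand`, `pauli_rebuild`). Since $\hat\sigma_\mu = \sigma_\mu/\sqrt{2}$, the coefficient of $\sigma_\mu$ in (5.59) is $\mathrm{Tr}(\sigma_\mu X)/2$.

```python
SIGMA = {
    0: [[1, 0], [0, 1]],
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def levi_civita(j, k, l):
    """Antisymmetric 3-tensor for indices in {1, 2, 3}."""
    return (j - k) * (k - l) * (l - j) // 2

def product_rule_rhs(j, k):
    """Right-hand side of (5.58): delta_jk * 1 + i * sum_l eps_jkl * sigma_l."""
    out = [[(1 if j == k else 0) * SIGMA[0][r][c] for c in range(2)]
           for r in range(2)]
    for l in (1, 2, 3):
        eps = levi_civita(j, k, l)
        for r in range(2):
            for c in range(2):
                out[r][c] += 1j * eps * SIGMA[l][r][c]
    return out

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))

def pauli_expand(x):
    """Coefficients Tr(sigma_mu X) / 2 of X in the (unnormalized) Pauli basis."""
    return [trace(matmul(SIGMA[mu], x)) / 2 for mu in range(4)]

def pauli_rebuild(coeffs):
    return [[sum(coeffs[mu] * SIGMA[mu][r][c] for mu in range(4))
             for c in range(2)] for r in range(2)]

X = [[1 + 2j, 3], [4j, -5]]   # arbitrary test matrix
print(all(close(matmul(SIGMA[j], SIGMA[k]), product_rule_rhs(j, k))
          for j in (1, 2, 3) for k in (1, 2, 3)))
print(close(pauli_rebuild(pauli_expand(X)), X))
```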


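Example 5.4.1 Using (5.59), any $X = X^\dagger \in M_2(\mathbb{C})$ can be recast as

$$X = \begin{pmatrix} x_0 + x_3 & x_1 - i x_2 \\ x_1 + i x_2 & x_0 - x_3 \end{pmatrix}, \qquad x_{0,1,2,3} \in \mathbb{R}.$$

Its eigenvalues are $x_\pm = x_0 \pm \|x\|$, where $x = (x_1, x_2, x_3) \in \mathbb{R}^3$; then,

$$\|X\|_1 = \big|x_0 + \|x\|\big| + \big|x_0 - \|x\|\big|.$$

Thus, one distinguishes the following cases:

$$\begin{aligned}
x_0 = -|x_0| \ \&\ |x_0| \ge \|x\| &\implies \|X\|_1 = 2|x_0| \\
x_0 = -|x_0| \ \&\ |x_0| < \|x\| &\implies \|X\|_1 = 2\|x\| \\
x_0 = |x_0| \ \&\ x_0 \ge \|x\| &\implies \|X\|_1 = 2 x_0 \\
x_0 = |x_0| \ \&\ x_0 < \|x\| &\implies \|X\|_1 = 2\|x\|.
\end{aligned}$$

Thus, if $|x_0| \ge \|x\|$, $\|X\|_1 = 2|x_0|$, otherwise $\|X\|_1 = 2\|x\|$.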
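The case analysis of Example 5.4.1 amounts to the closed form $\|X\|_1 = 2\max(|x_0|, \|x\|)$. A minimal sketch follows, assuming the parametrization $X = x_0 \mathbb{1} + x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3$ with the standard Pauli matrices; the function names are illustrative and not taken from the text.

```python
import math

def bloch_coordinates(x):
    """Return x0 and (x1, x2, x3) such that X = x0*1 + x1*s1 + x2*s2 + x3*s3."""
    x0 = (x[0][0] + x[1][1]).real / 2
    x3 = (x[0][0] - x[1][1]).real / 2
    x1, x2 = x[1][0].real, x[1][0].imag   # lower-left entry is x1 + i*x2
    return x0, (x1, x2, x3)

def trace_norm(x):
    """Sum of the absolute values of the eigenvalues x0 +/- ||x||."""
    x0, vec = bloch_coordinates(x)
    r = math.sqrt(sum(c * c for c in vec))
    return abs(x0 + r) + abs(x0 - r)

def trace_norm_closed_form(x):
    """Closed form 2 * max(|x0|, ||x||) obtained from the case analysis."""
    x0, vec = bloch_coordinates(x)
    r = math.sqrt(sum(c * c for c in vec))
    return 2 * max(abs(x0), r)

samples = [
    [[3, 0], [0, 1]],            # |x0| > ||x||
    [[3, 0], [0, -1]],           # |x0| < ||x||
    [[0, 1 - 1j], [1 + 1j, 0]],  # x0 = 0
]
for m in samples:
    print(trace_norm(m), trace_norm_closed_form(m))
```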


The action of $\sigma_1$ on the standard ONB amounts to a spin flip; if 0 and 1 were classical spin states encoding bits, then $\sigma_1$ would implement the NOT logical operation: $0 \mapsto 1$, $1 \mapsto 0$, or $i \mapsto i \oplus 1$, where $\oplus$ denotes the binary addition (addition mod 2). The ONB associated with $\sigma_1$ consists of

$$|\pm\rangle = \frac{|0\rangle \pm |1\rangle}{\sqrt{2}}, \qquad \sigma_1|\pm\rangle = \pm|\pm\rangle.$$

This ONB is unitarily related to the standard one by the Hadamard rotation

$$U_H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{\sigma_1 + \sigma_3}{\sqrt{2}}, \qquad U_H\, \sigma_3\, U_H^\dagger = \sigma_1. \tag{5.60}$$


A system consisting of $n$ spins 1/2 is described by the matrix algebra $M_{2^n}(\mathbb{C}) = (M_2(\mathbb{C}))^{\otimes n}$; denoting by $\sigma^i_\mu$, $\mu = 0, 1, 2, 3$, the Pauli matrices of the $i$-th spin, the elements of $M_{2^n}(\mathbb{C})$ are linear combinations of operators of the form $\bigotimes_{i=1}^n \sigma^i_{\mu_i}$ acting on $(\mathbb{C}^2)^{\otimes n}$. The so-called computational basis of quantum information consists of tensor products of eigenvectors of $\sigma_3$,

$$|i^{(n)}\rangle = |i_1 i_2 \cdots i_n\rangle = |i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_n\rangle, \qquad i_j \in \{0, 1\},$$

that are in one-to-one correspondence with bit-strings $i^{(n)} \in \{0,1\}^n$.

One may interpret them as orthogonal configurations of quantum spins located at the integer sites $1 \le \ell \le n$ of an infinite 1-dimensional lattice. Of course, unlike for classical spins, in this case linear combinations of these configurations are also possible physical states. Thus, a one-dimensional array of $n$ spins 1/2 provides a non-commutative counterpart to classical spin-chains of finite length $n$. Interestingly, their algebra $M_{2^n}(\mathbb{C})$ also describes $n$ degrees of freedom satisfying Canonical Anticommutation Relations (CAR).


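Example 5.4.2 (Finite Spin Systems: CAR) From (5.58) it follows that different Pauli matrices anticommute,

$$\{\sigma_j, \sigma_k\} = \sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk}\, \mathbb{1}_2.$$

Set $\sigma_+ := \dfrac{\sigma_1 + i\sigma_2}{2} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\sigma_- := \dfrac{\sigma_1 - i\sigma_2}{2} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$. These matrices fulfil the following algebraic relations

$$\{\sigma_+, \sigma_-\} = \mathbb{1}_2, \qquad [\sigma_+, \sigma_-] = \sigma_3 \tag{5.61}$$

$$\sigma_\pm^2 = 0, \qquad [\sigma_3, \sigma_+] = 2\sigma_+, \qquad [\sigma_3, \sigma_-] = -2\sigma_-. \tag{5.62}$$

With $|0\rangle$, $|1\rangle$ the eigenvectors of $\sigma_3$, $\sigma_-|1\rangle = 0$, while $\sigma_+|1\rangle = |0\rangle$.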

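In the case of $n$-spin 1/2 systems, let $\sigma^i_\# \in \{\sigma^i_3, \sigma^i_\pm, \mathbb{1}_i\}$ denote the spin operators relative to the $i$-th spin. They are identified as elements of $M_{2^n}(\mathbb{C})$ by embedding them as

$$M_2(\mathbb{C}) \ni \sigma^i_\# \mapsto \mathbb{1}_{[1,i-1]} \otimes \sigma^i_\# \otimes \mathbb{1}_{[i+1,n]}, \tag{5.63}$$

where $\mathbb{1}_{[j,k]} := \mathbb{1}_j \otimes \mathbb{1}_{j+1} \otimes \cdots \otimes \mathbb{1}_k$ is the tensor product of as many identity matrices $\mathbb{1} \in M_2(\mathbb{C})$ as the sites in the subset $[j, k]$. Then, setting

$$a_i := \Big(\bigotimes_{j=1}^{i-1} \sigma^j_3\Big) \otimes \sigma^i_- \otimes \mathbb{1}_{[i+1,n]}, \qquad a_i^\dagger = \Big(\bigotimes_{j=1}^{i-1} \sigma^j_3\Big) \otimes \sigma^i_+ \otimes \mathbb{1}_{[i+1,n]},$$

one obtains operators that obey the CAR of $n$ Fermionic degrees of freedom:

$$\{a_i, a_j^\dagger\} = a_i a_j^\dagger + a_j^\dagger a_i = \delta_{ij}, \qquad \{a_i, a_j\} = \{a_i^\dagger, a_j^\dagger\} = 0. \tag{5.64}$$

Further, the state vector $|1\rangle^{\otimes n} := |1\rangle \otimes |1\rangle \otimes \cdots \otimes |1\rangle$ ($n$ times), consisting of $n$ spins all pointing down, behaves as the vacuum state for it is annihilated by all $a_i$, while $a_i^\dagger$, acting on it, creates, up to a sign, the state vector with the $i$-th spin pointing up,

$$a_i |1\rangle^{\otimes n} = 0, \qquad a_i^\dagger |1\rangle^{\otimes n} = (-1)^{i-1}\, |1 \cdots 1\, \underbrace{0}_{i\text{-th site}}\, 1 \cdots 1\rangle.$$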


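There cannot be more than one Fermion for each $i$ as from (5.64) $(a_i^\dagger)^2 = 0$. Thus, products of the form $\prod_{j=1}^n (a_j^\dagger)^{i_j}$, where $i_j = 0, 1$ and $(a_j^\dagger)^0 = \mathbb{1}$, create the computational basis,

$$\prod_{j=1}^n (a_j^\dagger)^{i_j}\, |1\rangle^{\otimes n} = \pm\, |i_1 \oplus 1, i_2 \oplus 1, \ldots, i_n \oplus 1\rangle,$$

where $\oplus$ denotes summation mod 2. Since

$$[AB, C] = ABC - CAB = A(BC + CB) - (AC + CA)B = A\{B, C\} - \{A, C\}B, \tag{5.65}$$

the number operator

$$\hat{N} := \sum_{i=1}^n a_i^\dagger a_i = \sum_{i=1}^n \mathbb{1}_{[1,i-1]} \otimes \sigma^i_+ \sigma^i_- \otimes \mathbb{1}_{[i+1,n]} = \sum_{i=1}^n \mathbb{1}_{[1,i-1]} \otimes \frac{\mathbb{1}_i + \sigma^i_3}{2} \otimes \mathbb{1}_{[i+1,n]} \tag{5.66}$$

satisfies

$$[\hat{N}, a_i] = -a_i, \qquad [\hat{N}, a_i^\dagger] = a_i^\dagger, \tag{5.67}$$

whence $\hat{N}|i^{(n)}\rangle = \big(\sum_{j=1}^n (i_j \oplus 1)\big)\, |i^{(n)}\rangle$. Therefore, the basis vectors $|i^{(n)}\rangle$ are occupation number states, that is eigenstates of $\hat{N}$; they span the so-called Fock space $\mathbb{H}_F^{(n)} = |\mathrm{vac}\rangle \oplus \bigoplus_{k=1}^n \mathbb{H}_k$, where $|\mathrm{vac}\rangle = |1\rangle^{\otimes n}$ is the vacuum state and $\mathbb{H}_k$ is the Hilbert space corresponding to $k$ Fermi degrees of freedom.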

As much as one can construct $n$ Fermi creation and annihilation operators satisfying the CAR out of $n$ spin 1/2 operators, so one can obtain the $n$ spin algebra $M_{2^n}(\mathbb{C})$ out of the creation and annihilation operators of $n$ Fermi degrees of freedom. Indeed, from (5.66) one derives $\sigma^i_3 = 2 a_i^\dagger a_i - \mathbb{1}$ and

$$\sigma^i_- = \Big(\prod_{j=1}^{i-1} (2 a_j^\dagger a_j - \mathbb{1})\Big)\, a_i, \qquad \sigma^i_+ = \Big(\prod_{j=1}^{i-1} (2 a_j^\dagger a_j - \mathbb{1})\Big)\, a_i^\dagger.$$

These relations are known as Jordan-Wigner transformations [347]. They show that the algebra of $n$ Fermi degrees of freedom is isomorphic to that of $n$ spins 1/2, $M_{2^n}(\mathbb{C})$. It is important to notice that one Fermi degree of freedom corresponds to a totally delocalized spin operator.
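The Jordan-Wigner construction can be verified explicitly for a small chain. The following sketch is an illustration under stated assumptions: it takes $\sigma_3 = \mathrm{diag}(1, -1)$ with $|0\rangle = (1, 0)^T$, so that the basis index whose binary digits are $i_1 i_2 \cdots i_n$ labels $|i_1 \cdots i_n\rangle$; the helper names (`kron_all`, `annihilator`, `check_car`, `number_operator`) are hypothetical. All matrices are real, so the adjoint reduces to the transpose.

```python
I2 = [[1, 0], [0, 1]]
S3 = [[1, 0], [0, -1]]
SM = [[0, 0], [1, 0]]   # sigma_-

def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def kron_all(factors):
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def scaled_identity(dim, c):
    return [[c if r == col else 0 for col in range(dim)] for r in range(dim)]

def annihilator(i, n):
    """a_i for site i = 1..n: sigma_3 on sites < i, sigma_- on site i, identity after."""
    return kron_all([S3] * (i - 1) + [SM] + [I2] * (n - i))

def check_car(n):
    """True if the a_i and their adjoints satisfy the relations (5.64)."""
    dim = 2 ** n
    a = [annihilator(i, n) for i in range(1, n + 1)]
    ad = [transpose(m) for m in a]
    for i in range(n):
        for j in range(n):
            expected = scaled_identity(dim, 1 if i == j else 0)
            if anticommutator(a[i], ad[j]) != expected:
                return False
            if anticommutator(a[i], a[j]) != scaled_identity(dim, 0):
                return False
    return True

def number_operator(n):
    total = scaled_identity(2 ** n, 0)
    for i in range(1, n + 1):
        a = annihilator(i, n)
        total = add(total, matmul(transpose(a), a))
    return total

print(check_car(3))
N = number_operator(3)
print([N[b][b] for b in range(8)])   # occupation = number of 0 digits in the label
```

The diagonal of the number operator counts the spins pointing up, so the all-down vector $|1\rangle^{\otimes n}$ (last basis index) has occupation zero, in agreement with its role as vacuum.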


Remark 5.4.1 ([342,343]) The kinematical description of $n$ Fermionic degrees of freedom is abstractly provided by a set of $n$ operators $a_j$, $a_j^\dagger$ satisfying the CAR (5.64) together with the algebra $\mathcal{A}_F$ comprising all polynomials $P(a_j, a_j^\dagger)$ constructed with them. The Fock one is a concrete representation of the $a_j$, $a_j^\dagger$ as annihilation and creation operators on a Hilbert space with a distinguished vector $|\mathrm{vac}\rangle$, the vacuum state, such that $a_j|\mathrm{vac}\rangle = 0$ for all $j = 1, 2, \ldots, n$.

Because of the CAR relations, in any representation $\pi$ on a Hilbert space $\mathbb{H}$, $\pi(a_j)$ and $\pi(a_j^\dagger)$ are bounded operators with respect to the uniform norm:

$$\pi(a_j^\dagger a_j) + \pi(a_j a_j^\dagger) = \mathbb{1} \implies \|\pi(a_j^\dagger a_j)\| = \|\pi(a_j)\|^2 \le 1. \tag{5.68}$$

Also, any two irreducible representations $(\pi_{1,2}(\mathcal{A}_F), \mathbb{H}_{1,2})$ of the CAR of finitely many Fermions are unitarily equivalent to the Fock one (see Definition 5.3.6). Indeed, from the previous example, we know that both representations are isomorphic to $M_{2^n}(\mathbb{C})$ for $n$ Fermions; so, the positive operators $\pi_i(\hat{N}) = \sum_{j=1}^n \pi_i(a_j)^\dagger \pi_i(a_j)$ have discrete integer spectrum with an eigenvalue 0. In fact, if $\pi_i(\hat{N})|\psi^i_\lambda\rangle = \lambda|\psi^i_\lambda\rangle$, then (5.67) implies, for any power $k$,

$$[\hat{N}, a_j^k] = a_j\,[\hat{N}, a_j^{k-1}] + [\hat{N}, a_j]\, a_j^{k-1} = a_j\,[\hat{N}, a_j^{k-1}] - a_j^k = -k\, a_j^k,$$

so that

$$\pi_i(\hat{N})\,\pi_i(a_j)^k\, |\psi^i_\lambda\rangle = \lambda\, \pi_i(a_j)^k |\psi^i_\lambda\rangle + [\pi_i(\hat{N}), \pi_i(a_j)^k]\, |\psi^i_\lambda\rangle = (\lambda - k)\, \pi_i(a_j)^k |\psi^i_\lambda\rangle.$$
Thus the spectrum is discrete with 0 as its smallest eigenvalue; the corresponding eigenvector $|\psi^i_0\rangle$ is annihilated by all $\pi_i(a_j)$ and is unique as implied by the assumed irreducibility of the representation $\pi_i$. In fact, the linear span $\pi_i(\mathcal{A}_F)|\psi^i_0\rangle$ is dense in $\mathbb{H}_i$ (see Definition 5.3.3); therefore, if $\pi_i(a_j)|\phi^i\rangle = 0$ for all $j = 1, 2, \ldots, n$, then the same should hold for its component $|\phi^i_\perp\rangle$ orthogonal to $|\psi^i_0\rangle$; therefore, $\langle \phi^i_\perp | \pi_i(P(a_j, a_j^\dagger)) | \psi^i_0 \rangle = 0$ for all polynomials in annihilation and creation operators, whence $|\phi^i_\perp\rangle = 0$ as it would be orthogonal to a dense subset in $\mathbb{H}_i$. The eigenvector $|\psi^i_0\rangle$ is the vacuum for the Fock representation $\pi_i$; let $U : \mathbb{H}_1 \to \mathbb{H}_2$ be such that

$$U|\psi^1_0\rangle = |\psi^2_0\rangle, \qquad U\, \pi_1(P(a_j, a_j^\dagger))\, |\psi^1_0\rangle = \pi_2(P(a_j, a_j^\dagger))\, |\psi^2_0\rangle.$$

Since the scalar products $\langle \psi^i_0 | \pi_i(P')^\dagger \pi_i(P'') | \psi^i_0 \rangle$, with $P'$ and $P''$ arbitrary polynomials, are completely determined by the CAR relations, their values do not depend on the representation chosen; that is

$$\langle \psi^1_0 | \pi_1(P')^\dagger \pi_1(P'') | \psi^1_0 \rangle = \langle \psi^2_0 | \pi_2(P')^\dagger \pi_2(P'') | \psi^2_0 \rangle = \langle \psi^1_0 | \pi_1(P')^\dagger\, U^\dagger U\, \pi_1(P'') | \psi^1_0 \rangle$$

on a dense set, whence $U$ extends to an isometry from $\mathbb{H}_1$ to $\mathbb{H}_2$.


Of course, not all quantum systems with finite degrees of freedom are finite level 
systems. A free quantum particle in one dimension or a one-dimensional quantum 
harmonic oscillator are systems with one degree of freedom, but they are described 
by means of the infinite dimensional Hilbert space of square-integrable complex
functions over R. In quantum information, these systems are sometimes referred 
to as continuous variable systems in contrast to spin-like systems whose variables 
(observables) are instead discrete (N x N matrices). 

For continuous variable systems the standard kinematics is more appropriately 
given in terms of unitary groups of translations in position and momentum. The 
algebraic relations between them are known as Canonical Commutation Relations 
(CCR). 

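Consider a Hamiltonian classical system with $f$ degrees of freedom and canonical coordinates $\mathbb{R}^{2f} \ni (q, p)$, $q = (q_1, q_2, \ldots, q_f)$, $p = (p_1, p_2, \ldots, p_f)$. In standard quantization, one introduces unbounded, densely defined, self-adjoint position and momentum operators $(\hat{q}_i, \hat{p}_i)$ on $\mathbb{H} = L^2_{\mathrm{d}q}(\mathbb{R}^f)$ defined, in the so-called position representation, by

$$(\hat{q}_i \psi)(q) = q_i\, \psi(q), \qquad (\hat{p}_i \psi)(q) = -i\, \partial_{q_i} \psi(q), \qquad \psi \in \mathbb{H}. \tag{5.69}$$

On a common dense domain, they satisfy the standard commutation relations (with $\hbar = 1$)

$$[\hat{q}_i, \hat{p}_j] = i\, \delta_{ij}, \qquad [\hat{q}_i, \hat{q}_j] = [\hat{p}_i, \hat{p}_j] = 0. \tag{5.70}$$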
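Unlike Fermionic annihilation and creation operators, the operators $\hat{q}_i$ and $\hat{p}_i$ have continuous spectrum and cannot be bounded. Indeed, from (5.70),

$$[\hat{q}_i, \hat{p}_i^n] = i\, n\, \hat{p}_i^{n-1} \implies 2\, \|\hat{q}_i\|\, \|\hat{p}_i\| \ge n, \tag{5.71}$$

for all integer $n$. Because of unboundedness, one introduces the one-parameter groups of unitary operators $\{U_i(q)\}_{q \in \mathbb{R}}$ and $\{V_i(p)\}_{p \in \mathbb{R}}$,

$$(U_i(q)\psi)(q') = \psi(q' + q_i), \qquad (V_i(p)\psi)(q') = e^{i p\, q'_i}\, \psi(q'), \tag{5.72}$$

where $(q_i)_j = \delta_{ij}\, q$, in position representation, while

$$(U_i(q)\tilde\psi)(p') = e^{i q\, p'_i}\, \tilde\psi(p'), \qquad (V_i(p)\tilde\psi)(p') = \tilde\psi(p' - p_i) \tag{5.73}$$

in momentum representation, with $(p_i)_j = \delta_{ij}\, p$.
These groups are continuous with respect to the strong-operator topology, whence, by Stone's theorem [353], they are generated by the self-adjoint operators $\hat{p}_i$ and $\hat{q}_i$:

$$U_i(q) := \exp(i\, q\, \hat{p}_i), \qquad V_i(p) := \exp(i\, p\, \hat{q}_i). \tag{5.74}$$

By writing them as formal series, it can be checked that they implement translations in position, respectively momentum:

$$U_i(q)\, \hat{q}_i\, U_i^\dagger(q) = \hat{q}_i + q, \qquad V_i(p)\, \hat{p}_i\, V_i^\dagger(p) = \hat{p}_i - p. \tag{5.75}$$

The CCR can thus be recast as

$$\begin{aligned}
U_i(q)\, V_j(p) &= V_j(p)\, U_i(q), \qquad i \neq j, \\
U_i(q)\, V_i(p) &= e^{i q p}\, V_i(p)\, U_i(q).
\end{aligned} \tag{5.76}$$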
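Set $\hat{q} := (\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_f)$, $\hat{p} := (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_f)$, $\hat{r} := (\hat{q}, \hat{p})$, and

$$W(r) := e^{i(q \cdot \hat{p} + p \cdot \hat{q})} = e^{i\, r \cdot (\Sigma_1 \hat{r})}, \tag{5.77}$$

where $\cdot$ denotes the usual scalar product and $\Sigma_1 := \begin{pmatrix} 0 & \mathbb{1}_f \\ \mathbb{1}_f & 0 \end{pmatrix}$ with $\mathbb{1}_f$ the $f \times f$ identity matrix. These operators are known as Weyl operators; by the Campbell-Hausdorff formula,

$$\exp(A + B) = \exp\Big(-\frac{1}{2}[A, B]\Big) \exp(A) \exp(B), \tag{5.78}$$

that holds when $[A, B]$ is a multiple of the identity, they can be recast in the form

$$W(r) = e^{-\frac{i}{2} q \cdot p}\, e^{i q \cdot \hat{p}}\, e^{i p \cdot \hat{q}}. \tag{5.79}$$

The Weyl operators satisfy $W(r)^\dagger = W(-r)$ and the composition law

$$W(r_1)\, W(r_2) = e^{\frac{i}{2}\sigma(r_1, r_2)}\, W(r_1 + r_2), \tag{5.80}$$

where

$$\sigma(r_1, r_2) := q_1 \cdot p_2 - p_1 \cdot q_2 = r_1 \cdot (J\, r_2), \qquad J = \begin{pmatrix} 0 & \mathbb{1}_f \\ -\mathbb{1}_f & 0 \end{pmatrix}, \tag{5.81}$$

is the symplectic form characteristic of the Weyl relations. It thus follows that the * algebra $\mathcal{W}$ generated by linear combinations and products of Weyl operators coincides with their linear span.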
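The phase in the composition law of the Weyl operators follows from the commutation relations (5.70) and the Campbell-Hausdorff formula. The short computation below is a worked sketch; the shorthand $A_k$ is introduced here only for convenience, and the result is written with the symplectic form $\sigma(r_1, r_2) = q_1 \cdot p_2 - p_1 \cdot q_2$.

```latex
\begin{aligned}
A_k &:= i\,(q_k\cdot\hat p + p_k\cdot\hat q), \qquad W(r_k) = e^{A_k}, \qquad k = 1, 2, \\
[A_1, A_2] &= -\sum_{j,l}\Big( q_{1j}\,p_{2l}\,[\hat p_j, \hat q_l] + p_{1j}\,q_{2l}\,[\hat q_j, \hat p_l] \Big)
            = -\big(-i\, q_1\cdot p_2 + i\, p_1\cdot q_2\big)
            = i\,\sigma(r_1, r_2), \\
e^{A_1} e^{A_2} &= e^{\frac12 [A_1, A_2]}\, e^{A_1 + A_2}
\quad\Longrightarrow\quad
W(r_1)\,W(r_2) = e^{\frac{i}{2}\sigma(r_1, r_2)}\, W(r_1 + r_2).
\end{aligned}
```

Since $\sigma(r, -r) = 0$, the same computation gives $W(r)W(-r) = \mathbb{1}$, consistently with $W(r)^\dagger = W(-r)$.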


Remark 5.4.2 Given the * algebra $\mathcal{W}$, one looks for its closure with respect to a suitable topology; it turns out that the C* algebra that arises from the uniform norm is too small for physical purposes [353]. For instance, one would like that two Weyl operators $W(r_{1,2})$ be close to each other when $\|r_1 - r_2\| \to 0$; however, whenever $r_1 \neq r_2$,

$$\|W(r_1) - W(r_2)\| = \|\mathbb{1} - W^\dagger(r_1) W(r_2)\| = 2,$$

since unitary operators have norm 1.

The fact that the translation groups $\{U_i(q)\}_{q \in \mathbb{R}}$, $\{V_i(p)\}_{p \in \mathbb{R}}$ are strongly continuous makes it a natural choice to consider the closure $\overline{\mathcal{W}}$ of $\mathcal{W}$ with respect to the strong-operator topology. Similarly to the CAR, also for the CCR of finitely many degrees of freedom all irreducible (strongly continuous) representations are unitarily equivalent. In order to show this, one uses the so-called Weyl transform of a function $f \in L^1_{\mathrm{d}r}(\mathbb{R}^{2f})$ [173,353],

$$f \mapsto W(f) := \int_{\mathbb{R}^{2f}} \mathrm{d}r\, f(r)\, W(-r). \tag{5.82}$$

The Weyl transform is such that $W^\dagger(f) = W(f^\dagger)$, where $f^\dagger(r) := f^*(-r)$ and, by using (5.80), $W(f_1) W(f_2) = W(f_1 \star f_2)$ where

$$(f_1 \star f_2)(r) := \int_{\mathbb{R}^{2f}} \mathrm{d}w\, f_1(w)\, f_2(r - w)\, e^{\frac{i}{2}\sigma(w, r)}.$$

Let $P := W(g)$ where $g(r) := (2\pi)^{-f} \exp(-\frac{1}{4}\|r\|^2)$; then,

$$P = P^\dagger, \qquad P\, W(r)\, P = e^{-\frac{1}{4}\|r\|^2}\, P,$$

whence $P$ is a projection, for $P \neq 0$. Indeed, if $W(f) = 0$, choose $\psi, \phi \in \mathbb{H}$ such that $I_{\psi\phi} := \langle \psi | P | \phi \rangle \neq 0$; then, for all $r \in \mathbb{R}^{2f}$,

$$0 = \langle \psi | P\, W^\dagger(r)\, W(f)\, W(r)\, P | \phi \rangle = \int_{\mathbb{R}^{2f}} \mathrm{d}w\, f(w)\, e^{i\sigma(r, w)}\, \langle \psi | P\, W(-w)\, P | \phi \rangle = I_{\psi\phi} \int_{\mathbb{R}^{2f}} \mathrm{d}w\, f(w)\, e^{-\frac{1}{4}\|w\|^2}\, e^{i\sigma(r, w)}.$$

Since the integral is the Fourier transform of $f(w) \exp(-\frac{1}{4}\|w\|^2)$, it follows that $f(r) = 0$ almost everywhere, which is not true of $g(r)$.

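Let $\mathbb{K} \subseteq \mathbb{H}$ denote the subspace projected out by $P$ and consider an ONB $\{\phi_a\}$ in $\mathbb{K}$; since

$$\langle W(r_1)\phi_a | W(r_2)\phi_b \rangle = \langle P\phi_a | W^\dagger(r_1) W(r_2) | P\phi_b \rangle = \langle \phi_a | \phi_b \rangle\, e^{-\frac{i}{2}\sigma(r_1, r_2)}\, e^{-\frac{1}{4}\|r_1 - r_2\|^2}, \tag{5.83}$$

the closures $\mathbb{K}_a$ of the linear spans of vectors of the form $W(r)|\phi_a\rangle$, $r \in \mathbb{R}^{2f}$, are mutually orthogonal. Each vector in $\mathbb{K}_a$ is cyclic for $\overline{\mathcal{W}}$ which is then irreducibly represented on it. Further, $\bigoplus_a \mathbb{K}_a = \mathbb{H}$; in fact, the orthogonal complement $\mathbb{K}^\perp$ of $\bigoplus_a \mathbb{K}_a$ is also invariant under $\overline{\mathcal{W}}$. Restricting $\overline{\mathcal{W}}$ onto $\mathbb{K}^\perp$ yields another representation, $W_\perp$, such that the maps $f \mapsto W_\perp(f) := W(f)|_{\mathbb{K}^\perp}$ are injective. But this contradicts the fact that $W_\perp(g) = P|_{\mathbb{K}^\perp} = 0$.

Therefore, every strongly continuous representation of the CCR decomposes into an orthogonal sum of irreducible representations. Further, the relation

$$P\, W(r)|\phi_a\rangle = P\, W(r)\, P|\phi_a\rangle = e^{-\frac{1}{4}\|r\|^2}\, |\phi_a\rangle$$

extends linearly to the whole of $\mathbb{K}_a$, whence, on each of the invariant subspaces $\mathbb{K}_a$, $P$ acts as the projection onto $|\phi_a\rangle$. Consider any two irreducible representations $W_{a,b}$ with their orthogonal projections $P_{a,b} = W_{a,b}(g)$ onto the cyclic vectors, $P_{a,b}|\phi_{a,b}\rangle = |\phi_{a,b}\rangle$, and their representation Hilbert spaces $\mathbb{K}_{a,b}$. By linear extension, define the operator $U_{ab} : \mathbb{K}_a \to \mathbb{K}_b$ such that $U_{ab} W_a(r)|\phi_a\rangle := W_b(r)|\phi_b\rangle$, for all $r \in \mathbb{R}^{2f}$; from (5.83)

$$\langle W_a(r_1)\phi_a | W_a(r_2)\phi_a \rangle = \langle W_b(r_1)\phi_b | W_b(r_2)\phi_b \rangle = \langle W_a(r_1)\phi_a | U_{ab}^\dagger U_{ab} | W_a(r_2)\phi_a \rangle.$$

This relation extends to $\mathbb{K}_{a,b}$, whence $U_{ab}$ is an isometry such that

$$U_{ab}\, W_a(r)\, U_{ab}^\dagger = W_b(r), \qquad \forall\, r \in \mathbb{R}^{2f},$$

so that the two irreducible representations $W_{a,b}$ are unitarily equivalent.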


As we shall see in Chap. 7, for infinitely many degrees of freedom there are inequivalent irreducible representations of the CCR; this is true also for finitely many degrees of freedom when the symplectic manifold is not $\mathbb{R}^{2f}$ but rather a torus [343,353], or when strong continuity is relaxed. This is the case for a discrete formulation of the Weyl relations that holds for discrete variable quantum systems and turns out to be extremely useful for quantizing hyperbolic dynamics of the kind studied in Example 2.1.3.
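Example 5.4.3 (Weyl Relations: Finite Dimension) Actions similar to translations in position and momentum can also be defined for finite level systems, that is on the Hilbert space $\mathbb{H} = \mathbb{C}^N$. Let $\{|k\rangle\}_{k=0}^{N-1}$ be an ONB, fix $\alpha_{u,v} \in [0, 1)$ and consider the following matrices $U_N, V_N \in M_N(\mathbb{C})$

$$U_N := e^{\frac{2\pi i}{N}\alpha_u} \sum_{k=0}^{N-1} e^{\frac{2\pi i}{N} k}\, |k\rangle\langle k|, \qquad V_N := e^{\frac{2\pi i}{N}\alpha_v} \sum_{k=0}^{N-1} |k\rangle\langle k-1|,$$

together with the identification $|j\rangle \equiv |j \bmod N\rangle$. These operators are unitary and

$$U_N|\ell\rangle = e^{\frac{2\pi i}{N}(\alpha_u + \ell)}\, |\ell\rangle, \qquad V_N|\ell\rangle = e^{\frac{2\pi i}{N}\alpha_v}\, |\ell + 1\rangle. \tag{5.84}$$

Thus, setting $n := (n_1, n_2) \in \mathbb{Z}^2$, $U_N$ and $V_N$ satisfy the discrete Weyl relations

$$U_N^{n_1} V_N^{n_2} = e^{\frac{2\pi i}{N} n_1 n_2}\, V_N^{n_2} U_N^{n_1}. \tag{5.85}$$

Further, like in the continuous case, it is convenient to introduce the discrete Weyl operators

$$W_N(n) := e^{-\frac{i\pi}{N} n_1 n_2}\, U_N^{n_1} V_N^{n_2} = e^{\frac{i\pi}{N} n_1 n_2}\, V_N^{n_2} U_N^{n_1}. \tag{5.86}$$

They satisfy $W_N^\dagger(n) = W_N(-n)$ and the composition law

$$W_N(n)\, W_N(m) = e^{\frac{i\pi}{N}\sigma(n, m)}\, W_N(n + m), \tag{5.87}$$

with $\sigma(n, m) := n_1 m_2 - n_2 m_1$. Since $\|[U_N^{n_1}, V_N^{n_2}]\| = 2\,|\sin(\pi n_1 n_2/N)|$, letting $N \to \infty$ one expects to recover the commutative structure of Example 2.1.3.

In order to be compatible with a finite dimensional Hilbert space $\mathbb{C}^N$, powers as $U_N^N$ and $V_N^N$ must be proportional to the $N \times N$ identity matrix $\mathbb{1}_N$; in particular,

$$U_N^N = e^{2\pi i \alpha_u}\, \mathbb{1}_N, \qquad V_N^N = e^{2\pi i \alpha_v}\, \mathbb{1}_N. \tag{5.88}$$

Different choices of $\alpha_{u,v}$ label different irreducible representations $\mathcal{W}_N^{\alpha}$; they play a role in the quantization of classical discrete maps as in Example 2.1.3 [119] (see Example 5.6.1). These representations cannot be equivalent, otherwise, for $\alpha_u \neq \alpha'_u$, there would exist an isometry $T$ such that

$$T^\dagger\, U_{N,\alpha_u}^N\, T = e^{2\pi i \alpha_u}\, \mathbb{1}_N = U_{N,\alpha'_u}^N = e^{2\pi i \alpha'_u}\, \mathbb{1}_N,$$

where $U_{N,\alpha}$ denotes the operator $U_N$ fulfilling a specific rule (5.88).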
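When normalized, the discrete Weyl operators form an ONB in $M_N(\mathbb{C})$. Indeed, using (5.86) and (5.84), it turns out that

$$\begin{aligned}
\mathrm{Tr}(W_N(n)) &= \sum_{\ell=0}^{N-1} e^{-\frac{i\pi}{N} n_1 n_2}\, \langle \ell | U_N^{n_1} V_N^{n_2} | \ell \rangle \\
&= \sum_{\ell=0}^{N-1} e^{-\frac{i\pi}{N}\left[n_1 n_2 - 2 n_1(\alpha_u + \ell + n_2) - 2 n_2 \alpha_v\right]}\, \langle \ell | \ell + n_2 \rangle \\
&= \delta_{n_2, 0}\; e^{\frac{2\pi i}{N} n_1 \alpha_u} \sum_{\ell=0}^{N-1} e^{\frac{2\pi i}{N} n_1 \ell} = N\, \delta_{n, 0}.
\end{aligned} \tag{5.89}$$

This in turn yields

$$\mathrm{Tr}\big(W_N^\dagger(n)\, W_N(m)\big) = N\, \delta_{n, m}, \tag{5.90}$$

whence (see Example 5.2.5)

$$X = \frac{1}{N} \sum_{n \in \mathbb{Z}_N^2} \mathrm{Tr}\big(W_N^\dagger(n)\, X\big)\, W_N(n) \qquad \forall X \in M_N(\mathbb{C}), \tag{5.91}$$

where $\mathbb{Z}_N^2 := \{n = (n_1, n_2) : 0 \le n_i \le N - 1\}$.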
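The discrete Weyl relations and the orthogonality of the discrete Weyl operators can be checked numerically for small $N$. The sketch below is illustrative only: it assumes the phase convention $W_N(n) = e^{-i\pi n_1 n_2/N}\, U_N^{n_1} V_N^{n_2}$ and $V_N|\ell\rangle \propto |\ell+1\rangle$, and all function names (`clock_shift`, `weyl`, `hs_inner`, `weyl_rebuild`) are hypothetical.

```python
import cmath

def clock_shift(n, alpha_u=0.0, alpha_v=0.0):
    """U_N (diagonal phases) and V_N (cyclic shift l -> l+1) as nested lists."""
    w = 2j * cmath.pi / n
    u = [[cmath.exp(w * (alpha_u + r)) if r == c else 0j for c in range(n)]
         for r in range(n)]
    v = [[cmath.exp(w * alpha_v) if r == (c + 1) % n else 0j for c in range(n)]
         for r in range(n)]
    return u, v

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def matpow(a, p):
    n = len(a)
    out = [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]
    for _ in range(p):
        out = matmul(out, a)
    return out

def weyl(n, n1, n2, alpha_u=0.0, alpha_v=0.0):
    """Discrete Weyl operator for 0 <= n1, n2 <= N-1 (assumed phase convention)."""
    u, v = clock_shift(n, alpha_u, alpha_v)
    phase = cmath.exp(-1j * cmath.pi * n1 * n2 / n)
    return [[phase * x for x in row] for row in matmul(matpow(u, n1), matpow(v, n2))]

def hs_inner(a, b):
    """Hilbert-Schmidt scalar product Tr(a^dagger b)."""
    n = len(a)
    return sum(a[k][r].conjugate() * b[k][r] for k in range(n) for r in range(n))

def weyl_rebuild(x, alpha_u=0.0, alpha_v=0.0):
    """Rebuild X from its Weyl coefficients, following the expansion (5.91)."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for n1 in range(n):
        for n2 in range(n):
            w = weyl(n, n1, n2, alpha_u, alpha_v)
            coeff = hs_inner(w, x) / n
            for r in range(n):
                for c in range(n):
                    out[r][c] += coeff * w[r][c]
    return out

X = [[1, 2j, 0], [0.5, -1, 3], [4, 0, 1j]]   # arbitrary test matrix
Y = weyl_rebuild(X, 0.3, 0.7)
print(max(abs(Y[r][c] - X[r][c]) for r in range(3) for c in range(3)) < 1e-9)
```

The reconstruction works for any choice of the phases $\alpha_{u,v}$, since these only multiply each Weyl operator by a unimodular factor that cancels between the coefficient and the operator.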


Returning to continuous variable systems, a particular state vector in $\mathbb{H} = L^2_{\mathrm{d}q}(\mathbb{R}^f)$ is given by the Gaussian function

$$g(q) = \pi^{-f/4} \exp\Big(-\frac{\|q\|^2}{2}\Big), \qquad (\hat{q}_i + i\, \hat{p}_i)\, |g\rangle = 0, \tag{5.92}$$

for all $i = 1, 2, \ldots, f$. Notice that in momentum representation $\tilde{g}(p)$, obtained from $g(q)$ by Fourier transform, has the same Gaussian form as the latter. For reasons which will become immediately clear we shall refer to the Gaussian state as the vacuum and set $|\mathrm{vac}\rangle := |g\rangle$.


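Example 5.4.4 (CCR: Annihilation and Creation Operators) Given $f$ canonical pairs $(\hat{q}_i, \hat{p}_i)$, using (5.70) one shows that the operators

$$a_i := \frac{\hat{q}_i + i\, \hat{p}_i}{\sqrt{2}}, \qquad a_i^\dagger := \frac{\hat{q}_i - i\, \hat{p}_i}{\sqrt{2}}, \tag{5.93}$$

satisfy the CCR that describe Bosonic degrees of freedom,

$$[a_i, a_j^\dagger] = \delta_{ij}, \qquad [a_i, a_j] = [a_i^\dagger, a_j^\dagger] = 0. \tag{5.94}$$

The Gaussian function $|\mathrm{vac}\rangle$ plays thus the role of the vacuum for the CCR as it is annihilated by all $a_i$, $a_i|\mathrm{vac}\rangle = 0$. Since

$$a_i^\dagger a_i\, (a_i^\dagger)^{n_i} |\mathrm{vac}\rangle = a_i^\dagger\, [a_i, (a_i^\dagger)^{n_i}]\, |\mathrm{vac}\rangle = n_i\, (a_i^\dagger)^{n_i} |\mathrm{vac}\rangle,$$

the vectors $|k\rangle = |k_1, k_2, \ldots, k_f\rangle = \prod_{i=1}^f \dfrac{(a_i^\dagger)^{k_i}}{\sqrt{k_i!}}\, |\mathrm{vac}\rangle$ are such that

$$a_i|k\rangle = \sqrt{k_i}\; |k - 1_i\rangle, \qquad a_i^\dagger|k\rangle = \sqrt{k_i + 1}\; |k + 1_i\rangle, \tag{5.95}$$

where $k \pm 1_i := (k_1, \ldots, k_{i-1}, k_i \pm 1, k_{i+1}, \ldots, k_f)$. They are the orthonormal eigenvectors of the number operator,

$$\hat{N} := \sum_{i=1}^f a_i^\dagger a_i, \qquad \hat{N}|k\rangle = \Big(\sum_{i=1}^f k_i\Big)\, |k\rangle. \tag{5.96}$$

The occupation number states $|k\rangle$ span the Fock space for the $f$ Bosonic modes (degrees of freedom), $\mathbb{H}_F^{(f)} = |\mathrm{vac}\rangle \oplus \bigoplus_{n \ge 1} \mathbb{H}_n$, where $\mathbb{H}_n$ is the Hilbert space of $n$ modes. Unlike for Fermions, the number operator is unbounded and the Bosonic Fock space is infinite dimensional for such is the Hilbert space of each mode.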
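The ladder relations (5.95) and (5.96) can be made concrete for a single mode by cutting the Fock space at a maximal occupation number. This truncation is an assumption made here for illustration and is not part of the text; it also shows why finite matrices cannot satisfy the CCR exactly, since the commutator of two finite matrices is traceless. The function names are hypothetical.

```python
import math

def annihilation(dim):
    """Matrix of a on span{|0>, ..., |dim-1>}: a|k> = sqrt(k)|k-1>."""
    return [[math.sqrt(c) if r == c - 1 else 0.0 for c in range(dim)]
            for r in range(dim)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def creation(dim):
    """Adjoint of the (real) annihilation matrix: a^dagger|k> = sqrt(k+1)|k+1>."""
    return transpose(annihilation(dim))

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

dim = 5                                   # truncation level (illustrative choice)
a, ad = annihilation(dim), creation(dim)
comm = commutator(a, ad)
number = matmul(ad, a)

print([round(comm[k][k], 10) for k in range(dim)])    # identity except the last entry
print([round(number[k][k], 10) for k in range(dim)])  # occupation numbers 0..dim-1
```

The last diagonal entry of the truncated commutator compensates the others so that the trace vanishes; the defect moves to ever higher occupation numbers as the truncation level grows.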
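By introducing the $2f$-dimensional operator-valued vector

$$A := (a_1, \ldots, a_f, a_1^\dagger, \ldots, a_f^\dagger), \tag{5.97}$$

the Weyl operators (5.77) can be rewritten as

$$W(r) = \exp\Big(\sum_{j=1}^f \Big[\frac{q_j + i p_j}{\sqrt{2}}\, a_j - \frac{q_j - i p_j}{\sqrt{2}}\, a_j^\dagger\Big]\Big) = e^{Z^* \cdot A}, \quad \text{where} \tag{5.98}$$

$$Z := (z^*, -z), \qquad z_j := \frac{q_j + i p_j}{\sqrt{2}}, \qquad z \in \mathbb{C}^f, \tag{5.99}$$

while the CCR relations (5.80) become

$$e^{Z_1^* \cdot A}\, e^{Z_2^* \cdot A} = e^{-\frac{1}{2} Z_1^* \cdot (\Sigma_3 Z_2)}\, e^{(Z_1^* + Z_2^*) \cdot A},$$

where $\Sigma_3 = \begin{pmatrix} \mathbb{1}_f & 0 \\ 0 & -\mathbb{1}_f \end{pmatrix}$ and (5.99) yields

$$Z_1^* \cdot (\Sigma_3 Z_2) = -2i\, \Im(z_1^* \cdot z_2) = -i\, \sigma(r_1, r_2). \tag{5.100}$$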
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Of particular interest are the so-called displacement and squeezing operators. The 
former ones are defined by 


zl? 


D(z) = ETEA n eT Ge eia (5.101) 


where z := = {z} 1€ Cf and (5.78) has been used. Their action is as follows, 


CO 
1 di : 
D(z)! aj D(z) = doe [laj] =a; tz, f=l,2,...,n, (5.102) 


where di denotes the map d; [-] = [—zj at + zi aj , -] applied k; times. 

Given $z \in \mathbb{C}^f$ and the corresponding displacement operator $D(z)$, using (5.98) and (5.99), one finds that it corresponds to the Weyl operator $W(r(z))$ with $r(z) = \sqrt 2\, (-\Re(z), \Im(z))$, whence, via (5.80), one computes

$$ D(z_1)\, D(z_2) = e^{-i\, \Im(z_1^* \cdot z_2)}\, D(z_1 + z_2) . \tag{5.103} $$
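The squeezing operators are instead given by

$$ S(\eta) = \exp\Big( \frac12 \sum_{j=1}^f \big( \eta_j (a_j^\dagger)^2 - \eta_j^* a_j^2 \big) \Big) , \qquad \eta \in \mathbb{C}^f . \tag{5.104} $$

By means of the commutators

$$ d_\eta[a_j] = \frac12 \big[ -\eta_j (a_j^\dagger)^2 + \eta_j^* a_j^2 \, ,\, a_j \big] = \eta_j\, a_j^\dagger , \qquad d_\eta[a_j^\dagger] = \frac12 \big[ -\eta_j (a_j^\dagger)^2 + \eta_j^* a_j^2 \, ,\, a_j^\dagger \big] = \eta_j^*\, a_j , $$

with $\eta_j = e^{i\phi_j} |\eta_j|$, one computes

$$ a_j(\eta_j) := S^\dagger(\eta)\, a_j\, S(\eta) = \cosh(|\eta_j|)\, a_j + e^{i\phi_j} \sinh(|\eta_j|)\, a_j^\dagger . \tag{5.105} $$

Notice that the quadratic structure of the exponent of $S(\eta)$ sends annihilation and creation operators into sums of annihilation and creation operators

$$ A \mapsto S(\eta)\, A , \tag{5.106} $$

with $S(\eta)$ a complex $2f \times 2f$ matrix whose entries are given by (5.105).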
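Going back from (5.93) to canonical operators

$$ \hat q_j = \frac{a_j + a_j^\dagger}{\sqrt 2} , \qquad \hat p_j = \frac{a_j - a_j^\dagger}{i \sqrt 2} , $$

one finds

$$ \hat q_j(\eta) := S^\dagger(\eta)\, \hat q_j\, S(\eta) = \big( \cosh(|\eta_j|) + \cos(\phi_j) \sinh(|\eta_j|) \big)\, \hat q_j + \sin(\phi_j) \sinh(|\eta_j|)\, \hat p_j , $$

$$ \hat p_j(\eta) := S^\dagger(\eta)\, \hat p_j\, S(\eta) = \sin(\phi_j) \sinh(|\eta_j|)\, \hat q_j + \big( \cosh(|\eta_j|) - \cos(\phi_j) \sinh(|\eta_j|) \big)\, \hat p_j , $$

where

$$ S(\eta) = \exp\Big( \frac12 \sum_{j=1}^f \big( \eta_j (a_j^\dagger)^2 - \eta_j^* a_j^2 \big) \Big) = \exp\Big( -\frac{i}{2} \sum_{j=1}^f \big[ \Re(\eta_j) (\hat q_j \hat p_j + \hat p_j \hat q_j) - \Im(\eta_j) (\hat q_j^2 - \hat p_j^2) \big] \Big) . $$

Notice that, when $\sin(\phi_j) = 0$ and $\eta_j = r_j \in \mathbb{R}$, then

$$ \hat q_j(\eta) = e^{r_j}\, \hat q_j , \qquad \hat p_j(\eta) = e^{-r_j}\, \hat p_j , \tag{5.107} $$

so that the $j$-th Bosonic contribution to $S(\eta)$,

$$ S(\eta_j) = \exp\Big( -i\, \frac{r_j}{2} (\hat q_j \hat p_j + \hat p_j \hat q_j) \Big) , $$

becomes a so-called dilatation operator.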
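The linear map $(\hat q_j, \hat p_j) \mapsto (\hat q_j(\eta), \hat p_j(\eta))$ found above preserves the commutator $[\hat q_j, \hat p_j] = i$: for a real $2 \times 2$ matrix acting on $(\hat q, \hat p)$ the commutator is multiplied by the determinant of the matrix, which here equals $\cosh^2|\eta_j| - \sinh^2|\eta_j| = 1$. A minimal numerical sketch of this check, with function names that are illustrative only:

```python
import math


def squeeze_matrix(eta_abs, phi):
    """Real 2x2 matrix sending (q, p) to (q(eta), p(eta)) for eta = eta_abs * e^{i phi}."""
    c, s = math.cosh(eta_abs), math.sinh(eta_abs)
    return [[c + math.cos(phi) * s, math.sin(phi) * s],
            [math.sin(phi) * s, c - math.cos(phi) * s]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


generic = squeeze_matrix(0.7, 1.1)    # arbitrary modulus and phase
dilatation = squeeze_matrix(0.5, 0.0)  # real positive eta: q -> e^r q, p -> e^-r p
```

For `phi = 0` the matrix is diagonal with entries $e^{r}$ and $e^{-r}$, which is the dilatation (5.107).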


5.5 Quantum States 


Hilbert space vectors as those encountered in the previous sections are the simplest possible instances of quantum states: once it is known that a quantum system is in a physical state described by $\psi \in \mathbb{H}$, then the system observables $X = X^\dagger \in \mathbb{B}(\mathbb{H})$ have mean-values, or expectations, $\langle X \rangle_\psi := \langle \psi | X | \psi \rangle$. With $\mathbb{B}(\mathbb{H}) \ni P_\psi := |\psi\rangle\langle\psi|$ the orthogonal projector onto $|\psi\rangle$, using the trace (5.19), one writes $\langle X \rangle_\psi = \mathrm{Tr}(P_\psi X)$.

One-dimensional projections $P_\psi$ are known as pure states and are the most informative about the system they describe; they are quantum counterparts to the classical evaluation functionals $\delta_x(f) = f(x)$ of Sect. 2.2.1. Also in quantum mechanics, however, what is often practically achievable is not the specification of a precise state vector, but only that the system physical state corresponds to a projector $P_j$ occurring with a certain weight $0 \le \lambda_j \le 1$ within a statistical ensemble $J$ of projectors such that $\sum_{j \in J} \lambda_j = 1$. In such a case, the state of the system is a mixed state, namely a mixture of pure states; relatively to them, observables have mean-values that are linear convex combinations of pure state mean-values:

$$ \langle X \rangle_\rho := \sum_{j \in J} \lambda_j\, \langle \psi_j | X | \psi_j \rangle = \mathrm{Tr}(\rho\, X) , \qquad \rho = \sum_{j \in J} \lambda_j\, |\psi_j\rangle\langle\psi_j| . \tag{5.108} $$



As a linear convex combination (weighted sum) of projectors, $\rho$ is a positive operator of trace 1, known as density matrix.


Definition 5.5.1 (Density Matrices) Any positive trace-class operator $\rho \in \mathbb{B}_1(\mathbb{H})$ with $\mathrm{Tr}\rho = 1$ describes a mixed state; let $\rho = \sum_j r_j\, |r_j\rangle\langle r_j|$ be its spectral representation with $1 \ge r_j \ge 0$, $\sum_j r_j = 1$. Then, $\rho$ defines a positive, linear and normalized functional on $\mathbb{B}(\mathbb{H})$:

$$ \mathbb{B}(\mathbb{H}) \ni X \mapsto \omega_\rho(X) := \mathrm{Tr}(\rho\, X) = \sum_j r_j\, \langle r_j | X | r_j \rangle . \tag{5.109} $$

The set of all density matrices over the Hilbert space $\mathbb{H}$ of a quantum system $S$ will be denoted by $\mathcal{S}(S)$ or by $\mathbb{B}_1^+(\mathbb{H})$ and called state-space.

Its extremal points, those which cannot be decomposed into convex combinations of other states, are called pure states.


Remark 5.5.1 The eigenvalue $r_j$ of $\rho$ in (5.109) represents the probability to find the system in the state $|r_j\rangle$ once it is known that it is described by the density matrix $\rho$. However, a mixture as in (5.108) can correspond to a convex combination of non-orthogonal projectors $P_j = |\psi_j\rangle\langle\psi_j|$; this fact points to two crucial aspects that mark a substantial difference with respect to classical phase-space probability distributions: (1) the same $\rho$ can correspond to different mixtures and (2) the weights $\lambda_j$ of the mixture are interpretable as probabilities if and only if

$$ \lambda_k = \langle \psi_k | \rho | \psi_k \rangle = \sum_{j \in J} \lambda_j\, |\langle \psi_j | \psi_k \rangle|^2 \;\Longrightarrow\; \langle \psi_j | \psi_k \rangle = \delta_{jk} , $$

that is if and only if the state vectors corresponding to a given physical mixture are the eigenvectors of the associated density matrix.
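Both aspects of the remark can be seen in a two-level example. The sketch below builds the same density matrix once from its orthogonal eigenvectors, with weights $3/4$ and $1/4$, and once from two non-orthogonal vectors with equal weights $1/2$. The specific vectors and the helper names are chosen here purely for illustration.

```python
import math


def mixture(weights, vectors):
    """rho = sum_j lambda_j |psi_j><psi_j| for vectors in C^2."""
    rho = [[0j, 0j], [0j, 0j]]
    for lam, psi in zip(weights, vectors):
        for i in range(2):
            for j in range(2):
                rho[i][j] += lam * psi[i] * psi[j].conjugate()
    return rho


def expectation(rho, psi):
    """<psi| rho |psi>, real for a Hermitian rho."""
    return sum(psi[i].conjugate() * rho[i][j] * psi[j]
               for i in range(2) for j in range(2)).real


# Spectral decomposition: orthogonal eigenvectors, weights = eigenvalues.
orthogonal = mixture([0.75, 0.25], [(1 + 0j, 0j), (0j, 1 + 0j)])

# A different ensemble of non-orthogonal vectors with equal weights.
a = (math.sqrt(3) / 2 + 0j, 0.5 + 0j)
b = (math.sqrt(3) / 2 + 0j, -0.5 + 0j)
non_orthogonal = mixture([0.5, 0.5], [a, b])

# <a| rho |a> differs from the weight 1/2 assigned to |a> in the ensemble.
weight_seen_by_a = expectation(non_orthogonal, a)
```

The two ensembles give the same matrix, yet $\langle a | \rho | a \rangle = 5/8$ differs from the ensemble weight $1/2$, so the latter cannot be read as the probability of finding the system in $|a\rangle$.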


Examples 5.5.1 


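1. The geometry of the state-space of two level systems can be simply visualized. Using (5.59), the density matrices $\rho \in M_2(\mathbb{C})$ read

$$ \rho = \begin{pmatrix} r & s \\ s^* & 1 - r \end{pmatrix} = \frac12 \big( 1 + \vec\rho \cdot \vec\sigma \big) = \frac12 \begin{pmatrix} 1 + \rho_3 & \rho_1 - i \rho_2 \\ \rho_1 + i \rho_2 & 1 - \rho_3 \end{pmatrix} , \tag{5.110} $$

with $0 \le r \le 1$ and $r(1 - r) \ge |s|^2$ for $\rho \ge 0$. Thus, $\vec\rho \in \mathbb{R}^3$ has length

$$ 0 \le \|\vec\rho\|^2 = 1 - 4\, \mathrm{Det}(\rho) \le 1 . $$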


This is the so-called Bloch representation of the density matrices of two-level systems which identifies them by means of the vector $\vec\rho$, known as Bloch vector, inside the 3-dimensional unit sphere. The pure states are uniquely associated with points on its surface, while orthogonal states are connected by a diameter.

2. By expanding the exponential operators in (5.101) as power series and acting on the vacuum state, the resulting pure state,

$$ |z\rangle := D(z) |\mathrm{vac}\rangle = e^{-\frac{\|z\|^2}{2}}\, e^{z \cdot a^\dagger} |\mathrm{vac}\rangle = e^{-\frac{\|z\|^2}{2}} \sum_{k} \Big( \prod_{i=1}^f \frac{z_i^{k_i}}{\sqrt{k_i!}} \Big) |k\rangle , \tag{5.111} $$

is a so-called coherent state [369], that is an eigenstate of the annihilation operators

$$ a_i |z\rangle = z_i |z\rangle \qquad \forall\, i = 1, 2, \ldots, f . \tag{5.112} $$

It thus follows that the squared moduli of the components of $z$ are mean-occupation numbers: $\langle z | N | z \rangle = \sum_{i=1}^f |z_i|^2$. Coherent states cannot be orthogonal to each other for there are uncountably many of them; indeed, from (5.103), one derives

$$ \langle z^1 | z^2 \rangle = \langle 0 | D^\dagger(z^1)\, D(z^2) | 0 \rangle = e^{-\frac12 \|z^1 - z^2\|^2 + i\, \Im((z^1)^* \cdot z^2)} . \tag{5.113} $$


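Nevertheless, with $z_j = r_j \exp(i \varphi_j)$ and $dz = \prod_{j=1}^f r_j\, dr_j\, d\varphi_j$,

$$ \frac{1}{\pi^f} \int_{\mathbb{C}^f} dz\, |z\rangle\langle z| = \prod_{j=1}^f \sum_{p_j, q_j = 0}^\infty \frac{|p_j\rangle\langle q_j|}{\sqrt{p_j!\, q_j!}}\, \frac{1}{\pi} \int_0^\infty r_j\, dr_j\, e^{-r_j^2}\, r_j^{p_j + q_j} \int_0^{2\pi} d\varphi_j\, e^{i (p_j - q_j) \varphi_j} = \prod_{j=1}^f \sum_{p_j = 0}^\infty |p_j\rangle\langle p_j| = 1 . \tag{5.114} $$

Namely, coherent states form an overcomplete set in the Fock space $\mathbb{H}_F^{(f)}$.

Let $a^\dagger$ represent the creation operator of a photon in a single mode; a coherent state $|z\rangle = e^{-|z|^2/2} \sum_{n=0}^\infty \frac{z^n}{\sqrt{n!}}\, |n\rangle$ corresponds to a Poisson distribution over the mode number states, $|\langle n | z \rangle|^2 = \frac{|z|^{2n}}{n!}\, e^{-|z|^2}$.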


3. Passing from annihilation and creation operators to position and momentum ones, by inverting (5.99) the complex parameters $z \in \mathbb{C}^f$ correspond to points $r = (q, p) \in \mathbb{R}^{2f}$ in phase-space, where $q := \sqrt 2\, \Re(z)$ and $p := \sqrt 2\, \Im(z)$. Coherent states are then characterized by Gaussian localization both in $q$ and $p$; indeed, if $\sqrt 2\, z_0 = q_0 + i p_0$, then from the discussion preceding equation (5.103), using (5.79), (5.72) and (5.92) one gets, in position representation,

$$ \langle q | z_0 \rangle = \langle q | W((-q_0, p_0)) | \mathrm{vac} \rangle = e^{i p_0 \cdot (q - q_0/2)}\, \frac{e^{-\|q - q_0\|^2 / 2}}{\pi^{f/4}} , \tag{5.115} $$

while, in momentum representation (see (5.73)),

$$ \langle p | z_0 \rangle = e^{-i q_0 \cdot (p - p_0/2)}\, \frac{e^{-\|p - p_0\|^2 / 2}}{\pi^{f/4}} . \tag{5.116} $$
The phase-space localization properties of coherent states make them useful tools for studying the quasi-classical behavior of quantum states [167]. In particular, given a density matrix $\rho$ for a continuous variable system with $f$ degrees of freedom, one can compare its statistical properties with those of the function $R_\rho(q, p) := \langle z | \rho | z \rangle$ [353] which is positive, since $\rho \ge 0$, and normalized because of (5.114), thence a well-defined phase-space probability density. Vice versa, given a phase-space density $R(q, p)$ one can naturally associate to it a density matrix, $\rho_R$, which is diagonal with respect to the overcomplete set of coherent states:

$$ \rho_R = \int_{\mathbb{C}^f} \frac{dz}{\pi^f}\, R(z)\, |z\rangle\langle z| = \int_{\mathbb{R}^{2f}} \frac{dx\, dy}{\pi^f}\, R(x, y)\, |x + i y\rangle\langle x + i y| . \tag{5.117} $$

Most density matrices $\rho$ admit a so-called P-representation [145] as above in terms of a function $R(x, y)$ that is summable and normalized, but, in general, not positive and thus not a phase-space density.
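For a single mode, the Poisson statistics of a coherent state and the overlap formula (5.113) discussed in the examples above can be checked numerically from the number-state expansion. The following is a minimal sketch; the series cutoff and the two sample amplitudes are arbitrary choices made here, not values from the text.

```python
import cmath
import math


def number_amplitude(z, n):
    """<n|z> = exp(-|z|^2 / 2) z^n / sqrt(n!) for a single mode."""
    return cmath.exp(-abs(z) ** 2 / 2) * z ** n / math.sqrt(math.factorial(n))


def photon_distribution(z, cutoff=60):
    """Probabilities |<n|z>|^2 for n < cutoff (Poisson with mean |z|^2)."""
    return [abs(number_amplitude(z, n)) ** 2 for n in range(cutoff)]


def overlap_series(z1, z2, cutoff=60):
    """<z1|z2> computed from the truncated number-state expansion."""
    return sum(number_amplitude(z1, n).conjugate() * number_amplitude(z2, n)
               for n in range(cutoff))


def overlap_closed_form(z1, z2):
    """exp(-|z1 - z2|^2 / 2 + i Im(z1^* z2)), the single-mode form of (5.113)."""
    return cmath.exp(-abs(z1 - z2) ** 2 / 2 + 1j * (z1.conjugate() * z2).imag)


z1, z2 = 0.8 + 0.3j, -0.4 + 1.1j  # illustrative amplitudes
probs = photon_distribution(z1)
mean_number = sum(n * p for n, p in enumerate(probs))
```

The mean of the distribution reproduces $|z|^2$, and the modulus of the overlap never vanishes, which is the non-orthogonality noted after (5.112).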


The classical features of coherent states stand in sharp contrast with those of number eigenstates: this behavior shows up most evidently when photons interact with beam splitters [150].


5.5.1 Beam splitters 


A beam splitter is an optical device that is used to divide an incoming classical light beam of intensity $I$ along a spatial direction 1 into a reflected beam of intensity $R \times I$, with reflection coefficient $R$ along an orthogonal direction 2, and a transmitted beam of intensity $T \times I$, with transmission coefficient $T$ along the incoming direction 1. In absence of absorption and dissipation, $R + T = 1$.

Quantum mechanically, one associates photon modes to the two spatial directions; in an effective two-dimensional description, a generic single photon state $|\psi\rangle$ incident upon the beam splitter is a superposition of single-photon basis states $|1\rangle$, $|2\rangle$ describing photons impinging on the beam splitter along the directions 1 and 2 (Fig. 5.1).

Fig. 5.1 Beam splitter

In absence of dissipation, the interaction with the beam splitter produces outgoing photon states according to the rules

$$ |1'\rangle := U |1\rangle = t_1 |1\rangle + r_1 |2\rangle , \qquad |2'\rangle := U |2\rangle = r_2 |1\rangle + t_2 |2\rangle , $$

where $r_{1,2}, t_{1,2} \in \mathbb{C}$ are the reflection and transmission amplitudes along the directions 1, 2. Therefore, a natural matrix $U = \begin{pmatrix} t_1 & r_2 \\ r_1 & t_2 \end{pmatrix}$ appears which is unitary. In fact, by using creation operators $a_i^\dagger$ for the two modes, one writes $|i\rangle = a_i^\dagger |\mathrm{vac}\rangle$, where the vacuum $|\mathrm{vac}\rangle$ is the state with no photons. Setting $|1'\rangle := b_1^\dagger |\mathrm{vac}\rangle$ and $|2'\rangle := b_2^\dagger |\mathrm{vac}\rangle$ yields $b_1^\dagger = t_1 a_1^\dagger + r_1 a_2^\dagger$, $b_2^\dagger = r_2 a_1^\dagger + t_2 a_2^\dagger$ and the Hermitian conjugate linear relations. As the $b_i$, $b_i^\dagger$ create and annihilate new photon states, they must comply with the CCR (5.94), whence


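$$ [b_1, b_1^\dagger] = |t_1|^2 + |r_1|^2 = 1 = [b_2, b_2^\dagger] = |r_2|^2 + |t_2|^2 , \qquad [b_1, b_2^\dagger] = t_1^* r_2 + r_1^* t_2 = 0 . $$

The whole physical process can thus be characterized by the matrix

$$ U := \begin{pmatrix} t\, e^{i \psi_1} & r\, e^{i \phi_2} \\ r\, e^{i \phi_1} & t\, e^{i \psi_2} \end{pmatrix} , $$

where $r^2 + t^2 = 1$ and $(\phi_1 + \phi_2) - (\psi_1 + \psi_2) = \pi$. For sake of simplicity, we shall set $t = r = 1/\sqrt 2$, $\psi_1 = \psi_2 = 0$ and $\phi_1 = \phi$, so that $\phi_2 = \pi - \phi$; in this case, the matrix $U$ reads

$$ U = \begin{pmatrix} t_1 & r_2 \\ r_1 & t_2 \end{pmatrix} = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & -e^{-i\phi} \\ e^{i\phi} & 1 \end{pmatrix} . \tag{5.118} $$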


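It describes a so-called 50:50 beam splitter whose reflection amplitudes $r_1$ and $r_2$ carry the phases $\phi$ and $\pi - \phi$. The transformation $a_{1,2} \mapsto b_{1,2}$ can be unitarily implemented, namely we can explicitly construct the operator $\widetilde U$ that sends $|1\rangle$, $|2\rangle$ into $|1'\rangle$, $|2'\rangle$. Since the beam splitter does nothing to the vacuum, namely $\widetilde U |\mathrm{vac}\rangle = \widetilde U^\dagger |\mathrm{vac}\rangle = |\mathrm{vac}\rangle$, the action of $\widetilde U$ must be such that $b_{1,2}^\dagger = \widetilde U\, a_{1,2}^\dagger\, \widetilde U^\dagger$. Let us consider the operator

$$ \widetilde U(z) = e^{\, z^* a_1^\dagger a_2 - z\, a_1 a_2^\dagger} , \qquad z = |z|\, e^{i\alpha} ; $$

it is unitary and its action can be computed as for the displacement operators in (5.102). Namely

$$ \widetilde U(z)\, a_{1,2}^\dagger\, \widetilde U^\dagger(z) = \sum_{k=0}^\infty \frac{(-1)^k}{k!}\, d_z^k[a_{1,2}^\dagger] , \qquad d_z[\,\cdot\,] = [ z\, a_1 a_2^\dagger - z^* a_1^\dagger a_2 \, ,\, \cdot\, ] . $$

Since $d_z[a_1^\dagger] = z\, a_2^\dagger$ and $d_z[a_2^\dagger] = -z^* a_1^\dagger$, the infinite sums can be explicitly computed, the result being

$$ \widetilde U(z)\, a_1^\dagger\, \widetilde U^\dagger(z) = (\cos|z|)\, a_1^\dagger - e^{i\alpha} (\sin|z|)\, a_2^\dagger , $$

$$ \widetilde U(z)\, a_2^\dagger\, \widetilde U^\dagger(z) = e^{-i\alpha} (\sin|z|)\, a_1^\dagger + (\cos|z|)\, a_2^\dagger . $$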


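Therefore, the matrix (5.118) corresponds to $|z| = \pi/4$ and $\alpha = \pi + \phi$; set $\widetilde U := \widetilde U\big(-\frac{\pi}{4} \exp(i\phi)\big)$. In terms of $\widetilde U$ it is now easy to check that an incoming photon along direction 1 emerges in a superposition of states,

$$ |1'\rangle = \widetilde U |1\rangle = \widetilde U\, a_1^\dagger\, \widetilde U^\dagger |\mathrm{vac}\rangle = \frac{a_1^\dagger + e^{i\phi} a_2^\dagger}{\sqrt 2}\, |\mathrm{vac}\rangle = \frac{|1\rangle + e^{i\phi} |2\rangle}{\sqrt 2} . $$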


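Instead, for a coherent state $|\alpha\rangle = D(\alpha) |\mathrm{vac}\rangle$ of mode 1, $\alpha \in \mathbb{C}$, one has

$$ \widetilde U |\alpha\rangle = \widetilde U\, D(\alpha)\, \widetilde U^\dagger |\mathrm{vac}\rangle = e^{\, \alpha\, \widetilde U a_1^\dagger \widetilde U^\dagger - \alpha^* \widetilde U a_1 \widetilde U^\dagger} |\mathrm{vac}\rangle = e^{\frac{\alpha}{\sqrt 2} a_1^\dagger - \frac{\alpha^*}{\sqrt 2} a_1}\, e^{\frac{\alpha e^{i\phi}}{\sqrt 2} a_2^\dagger - \frac{\alpha^* e^{-i\phi}}{\sqrt 2} a_2} |\mathrm{vac}\rangle = \Big| \frac{\alpha}{\sqrt 2} \Big\rangle_1 \otimes \Big| \frac{e^{i\phi} \alpha}{\sqrt 2} \Big\rangle_2 . $$

This means that an incoming coherent state gets split into a transmitted coherent state $|\alpha/\sqrt 2\rangle_1$ of intensity $|\alpha|^2/2$ and a reflected/phase-shifted coherent state $|e^{i\phi} \alpha/\sqrt 2\rangle_2$ of intensity $|\alpha|^2/2$, exactly as with classical light.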

On the other hand, a purely quantum effect results from a beam splitter acting on a state $|1_1 1_2\rangle$ consisting of two photons coming from the orthogonal directions 1, 2. It is transformed into $|1'_1 1'_2\rangle = b_1^\dagger b_2^\dagger |\mathrm{vac}\rangle$; explicitly,

$$ |1'_1 1'_2\rangle = \widetilde U a_1^\dagger \widetilde U^\dagger\, \widetilde U a_2^\dagger \widetilde U^\dagger |\mathrm{vac}\rangle = \frac12 \big( a_1^\dagger + e^{i\phi} a_2^\dagger \big) \big( -e^{-i\phi} a_1^\dagger + a_2^\dagger \big) |\mathrm{vac}\rangle $$

$$ = \frac12 \big( -e^{-i\phi} (a_1^\dagger)^2 + a_1^\dagger a_2^\dagger - a_2^\dagger a_1^\dagger + e^{i\phi} (a_2^\dagger)^2 \big) |\mathrm{vac}\rangle = \frac{-e^{-i\phi} |2_1\rangle + e^{i\phi} |2_2\rangle}{\sqrt 2} . $$

The outgoing state thus consists in a superposition of states with both photons moving along a same direction; then, photons will always be found together either along direction 1, $|2_1\rangle$, or along direction 2, $|2_2\rangle$, with the same probability. On the contrary, no experiment can reveal one photon along direction 1 and the other photon along direction 2. This is because the amplitude for $|1_1 1_2\rangle$ is the sum of the amplitudes of all processes leading to this state; in the present case these are reflections along both directions 1 and 2 with amplitudes $r_1 r_2 = -1/2$ and transmissions along both directions 1 and 2 with amplitudes $t_1 t_2 = 1/2$. These processes interfere destructively.
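The cancellation can be verified directly from the amplitudes in (5.118). The sketch below expands $b_1^\dagger b_2^\dagger = (t_1 a_1^\dagger + r_1 a_2^\dagger)(r_2 a_1^\dagger + t_2 a_2^\dagger)$ and uses $(a_i^\dagger)^2 |\mathrm{vac}\rangle = \sqrt 2\, |2_i\rangle$; the function names and the sample phase are illustrative choices, not part of the text.

```python
import cmath
import math


def beam_splitter(phi):
    """50:50 beam splitter matrix [[t1, r2], [r1, t2]] as in (5.118)."""
    s = 1 / math.sqrt(2)
    t1, t2 = s, s
    r1 = s * cmath.exp(1j * phi)
    r2 = -s * cmath.exp(-1j * phi)
    return [[t1, r2], [r1, t2]]


def is_unitary(u, tol=1e-12):
    """Check that the columns of a 2x2 matrix are orthonormal."""
    for i in range(2):
        for j in range(2):
            entry = sum(u[k][i].conjugate() * u[k][j] for k in range(2))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True


def two_photon_amplitudes(u):
    """Amplitudes of |2_1>, |1_1 1_2>, |2_2> in b1^dagger b2^dagger |vac>."""
    (t1, r2), (r1, t2) = u
    both_in_1 = math.sqrt(2) * t1 * r2
    one_each = t1 * t2 + r1 * r2   # transmission-transmission + reflection-reflection
    both_in_2 = math.sqrt(2) * r1 * t2
    return both_in_1, one_each, both_in_2


u = beam_splitter(0.7)  # illustrative phase
amp_21, amp_11, amp_22 = two_photon_amplitudes(u)
```

The coincidence amplitude `amp_11` vanishes for every value of the phase, while the two bunched outcomes each occur with probability $1/2$.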


The same kind of effect appears in single photon experiments with Mach-Zehnder interferometers.

In a configuration as in Fig. 5.2, an incoming photon is either reflected or transmitted at a beam splitter $BS_1$ with amplitudes $r_1$ and $t_1$, reflected by perfectly reflecting mirrors $M$ and then either reflected or transmitted with amplitudes $r_2$ and $t_2$ at a second beam splitter $BS_2$. The outgoing photons are then counted by detectors $D_{1,2}$. The probability $P_1$ of a photon being detected at $D_1$ is determined by the amplitude at $D_1$ which is the sum of those of the processes "reflection at $BS_1$ + transmission at $BS_2$" and "transmission at $BS_1$ + reflection at $BS_2$", that is $P_1 = |r_1 t_2 + t_1 r_2|^2$. Analogously, the processes "transmission at $BS_1$ + transmission at $BS_2$" and "reflection at $BS_1$ + reflection at $BS_2$" contribute to the detection probability at $D_2$, $P_2 = |t_1 t_2 + r_1 r_2|^2$. One can visualize the entire process by means of a binary tree with one level for each beam splitter (Fig. 5.3).

Fig. 5.2 Mach-Zehnder interferometer

Fig. 5.3 Mach-Zehnder interferometer: binary tree

If both beam splitters act through a same operator $U$ as before, then, taking into account the impinging directions,

$$ P_1 = \Big| \frac12 \big( e^{i\phi} - e^{-i\phi} \big) \Big|^2 = \sin^2\phi , \qquad P_2 = \Big| \frac12 \big( 1 + e^{2i\phi} \big) \Big|^2 = \cos^2\phi . $$

An interference pattern thus emerges which depends on the phase-shift $\phi$ and can then be experimentally controlled.
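The two detection probabilities can be reproduced by summing path amplitudes. In the sketch below, which reflection amplitude of (5.118) is used on each path ($r_1 = e^{i\phi}/\sqrt 2$ or $r_2 = -e^{-i\phi}/\sqrt 2$) is an assumption inferred from the closed formulas above, since it depends on the direction along which the photon hits each beam splitter.

```python
import cmath
import math


def detection_probabilities(phi):
    """Return (P1, P2) for the Mach-Zehnder setup with two equal 50:50 beam splitters."""
    s = 1 / math.sqrt(2)
    t = s                             # transmission amplitude (both directions)
    r_a = s * cmath.exp(1j * phi)     # reflection amplitude r_1 of (5.118)
    r_b = -s * cmath.exp(-1j * phi)   # reflection amplitude r_2 of (5.118)
    # D1: "reflection + transmission" and "transmission + reflection"
    amp_d1 = r_a * t + t * r_b
    # D2: "transmission + transmission" and "reflection + reflection"
    amp_d2 = t * t + r_a * r_a
    return abs(amp_d1) ** 2, abs(amp_d2) ** 2


phi = 0.9  # illustrative phase
p1, p2 = detection_probabilities(phi)
```

The two probabilities add up to 1 for every phase, and at $\phi = 0$ all photons reach $D_2$.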


5.5.2 Uncertainty Relations 


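Besides being necessary for the consistency of the statistical interpretation of quantum mechanics, the positivity of quantum states is, together with non-commutativity, at the origin of the Heisenberg uncertainty relations.

Consider the CCR for $f$ degrees of freedom; with the notation of (5.77), let $\rho \in \mathbb{B}_1^+(\mathbb{H})$ be a density matrix such that all first moments $r_i := \mathrm{Tr}(\rho\, \hat r_i)$ and all second moments $\mathrm{Tr}(\rho\, \hat r_i \hat r_j)$ are finite. Then, the $2f \times 2f$ matrices

$$ \widetilde C^\rho := \Big[ \mathrm{Tr}\big( \rho\, (\hat r_i - r_i)(\hat r_j - r_j) \big) \Big]_{i,j=1}^{2f} , \qquad (\widetilde C^\rho)^T = \Big[ \mathrm{Tr}\big( \rho\, (\hat r_j - r_j)(\hat r_i - r_i) \big) \Big]_{i,j=1}^{2f} $$

are both positive. In fact,

$$ \langle u | \widetilde C^\rho | u \rangle = \sum_{i,j=1}^{2f} u_i^* u_j\, \mathrm{Tr}\big( \rho\, (\hat r_i - r_i)(\hat r_j - r_j) \big) = \mathrm{Tr}(\rho\, X^\dagger X) \ge 0 , $$

for all $u \in \mathbb{C}^{2f}$, where $X := \sum_{j=1}^{2f} u_j (\hat r_j - r_j)$, and similarly for the transposed matrix. Using commutators ($[\cdot\,, \cdot]$) and anti-commutators ($\{\cdot\,, \cdot\}$), one finds

$$ (\hat r_i - r_i)(\hat r_j - r_j) = \frac12 \{ \hat r_i - r_i\, ,\, \hat r_j - r_j \} + \frac12 [ \hat r_i\, ,\, \hat r_j ] , $$

$$ (\hat r_j - r_j)(\hat r_i - r_i) = \frac12 \{ \hat r_i - r_i\, ,\, \hat r_j - r_j \} - \frac12 [ \hat r_i\, ,\, \hat r_j ] . $$

With $J$ as in (5.81), the CCR (5.70) read $[\hat r_i, \hat r_j] = i\, J_{ij}$; it thus turns out that the correlation matrix

$$ C^\rho := \Big[ \frac12 \mathrm{Tr}\big( \rho\, \{ \hat r_i - r_i\, ,\, \hat r_j - r_j \} \big) \Big]_{i,j=1}^{2f} , \tag{5.119} $$

beside being positive, must also satisfy

$$ C^\rho \pm \frac{i}{2} J = C^\rho \pm \frac{i}{2} \begin{pmatrix} 0 & 1_f \\ -1_f & 0 \end{pmatrix} \ge 0 . \tag{5.120} $$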

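Let $f = 1$ and choose $\rho = |\psi\rangle\langle\psi|$; then,

$$ C^\rho \pm \frac{i}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} \langle \psi | \Delta^2 \hat q | \psi \rangle & \langle \psi | \{ \Delta \hat q, \Delta \hat p \} | \psi \rangle / 2 \pm i/2 \\ \langle \psi | \{ \Delta \hat q, \Delta \hat p \} | \psi \rangle / 2 \mp i/2 & \langle \psi | \Delta^2 \hat p | \psi \rangle \end{pmatrix} , $$

where $\Delta \hat q := \hat q - \langle \psi | \hat q | \psi \rangle$ and $\Delta \hat p := \hat p - \langle \psi | \hat p | \psi \rangle$. Therefore, (5.120) implies

$$ \langle \psi | \Delta^2 \hat q | \psi \rangle\, \langle \psi | \Delta^2 \hat p | \psi \rangle \ge \frac{ \langle \psi | \{ \Delta \hat q, \Delta \hat p \} | \psi \rangle^2 }{4} + \frac14 \ge \frac14 . $$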

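These are the uncertainty relations for conjugate position and momentum.

In terms of Bosonic annihilation and creation operators, using (5.93) and (5.97), the correlation matrix reads

$$ V^\rho := U_f^\dagger\, C^\rho\, U_f = \frac12 \Big[ \mathrm{Tr}\big( \rho\, \{ A_i - \langle A_i \rangle_\rho\, ,\, A_j^\dagger - \langle A_j^\dagger \rangle_\rho \} \big) \Big]_{i,j=1}^{2f} , \tag{5.121} $$

where $U_f = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \otimes 1_f$, so that $A = U_f^\dagger\, \hat r$, and $\langle A_i^\# \rangle_\rho := \mathrm{Tr}(\rho\, A_i^\#)$. Notice that the matrices

$$ \Big[ \mathrm{Tr}\big( \rho\, (A_i - \langle A_i \rangle_\rho)(A_j^\dagger - \langle A_j^\dagger \rangle_\rho) \big) \Big]_{i,j=1}^{2f} $$

and, with the operator order exchanged,

$$ \Big[ \mathrm{Tr}\big( \rho\, (A_j^\dagger - \langle A_j^\dagger \rangle_\rho)(A_i - \langle A_i \rangle_\rho) \big) \Big]_{i,j=1}^{2f} $$

are also positive. Since $[A_i, A_j^\dagger] = \delta_{ij}$ if $1 \le i \le f$, while $[A_i, A_j^\dagger] = -\delta_{ij}$ if $f + 1 \le i \le 2f$, and

$$ \frac12 \{ A_i, A_j^\dagger \} = A_i A_j^\dagger - \frac12 [A_i, A_j^\dagger] = A_j^\dagger A_i + \frac12 [A_i, A_j^\dagger] , $$

it follows that

$$ V^\rho \ge 0 , \qquad V^\rho \pm \frac12 \Sigma_3 \ge 0 , \qquad \Sigma_3 = \begin{pmatrix} 1_f & 0 \\ 0 & -1_f \end{pmatrix} . \tag{5.122} $$

The correlation matrix of a coherent state $\rho = |z\rangle\langle z|$ as in (5.111) is particularly simple; indeed, by virtue of $a_i |z\rangle = z_i |z\rangle$, it turns out that

$$ \langle z | a_i^\dagger a_j | z \rangle = z_i^* z_j , \qquad \langle z | a_i a_j | z \rangle = z_i z_j , \qquad \langle z | a_i a_j^\dagger | z \rangle = \delta_{ij} + z_i z_j^* , $$

whence $V^\rho = \frac12 \begin{pmatrix} 1_f & 0 \\ 0 & 1_f \end{pmatrix}$.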
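For $f = 1$ the condition (5.120) can be tested by hand: the Hermitian $2 \times 2$ matrix $C^\rho + \frac{i}{2} J$ is positive if and only if its diagonal entries and its determinant are non-negative, and the determinant condition is precisely the uncertainty relation. A minimal sketch, in which the function name, the tolerance and the sample values are illustrative choices:

```python
import math


def satisfies_uncertainty(var_q, var_p, cov_qp, tol=1e-12):
    """Check C + (i/2) J >= 0 for one mode.

    The matrix [[var_q, cov_qp + i/2], [cov_qp - i/2, var_p]] is Hermitian;
    it is positive iff its diagonal entries and its determinant are >= 0.
    """
    det = var_q * var_p - cov_qp ** 2 - 0.25
    return var_q >= 0 and var_p >= 0 and det >= -tol


# Vacuum or coherent state: both variances equal 1/2 (minimal uncertainty).
vacuum_ok = satisfies_uncertainty(0.5, 0.5, 0.0)

# Variances rescaled as in the dilatation (5.107) with r = 0.8.
r = 0.8
squeezed_ok = satisfies_uncertainty(math.exp(2 * r) / 2, math.exp(-2 * r) / 2, 0.0)

# Both variances below 1/2: not the correlation matrix of any state.
too_sharp_ok = satisfies_uncertainty(0.3, 0.3, 0.0)
```

The first two cases saturate the bound, while the third one violates it.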


5.5.3 Gaussian States 


In classical probability theory, continuous probability distributions $\mu$ on $\mathbb{R}^n$ can be described in terms of their Fourier transforms or characteristic functions [189]

$$ F_\mu(\xi) := \int d\mu(x)\, e^{i\, \xi \cdot x} . $$

In this way, the moments of the probability distribution can be obtained by differentiating $F_\mu(\xi)$ at $\xi = 0$. Similarly, let $\rho$ be a state of a continuous variable system with $f$ degrees of freedom equipped with a strongly continuous representation of the CCR in the form (5.98), then the characteristic function of $\rho$ is given by

$$ F^C_\rho(r) := \mathrm{Tr}(\rho\, W(r)) = \mathrm{Tr}(\rho\, e^{Z^* \cdot A}) =: F^V_\rho(z) , \tag{5.123} $$

where (5.98) and (5.99) have been used.


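Remark 5.5.2 The characteristic function $F^C_\rho(r)$ is the inverse of the Weyl-transform (5.82); indeed,

$$ \rho = \int_{\mathbb{R}^{2f}} \frac{dr}{(2\pi)^f}\, F^C_\rho(r)\, W(-r) , $$

where the convergence of the integral is understood with respect to the weak-operator topology on the representation Hilbert space. The easiest way to see this is to call $X$ the right hand side of the previous equality and calculate its matrix elements in the position representation (5.69) using (5.72):

$$ \langle q_1 | X | q_2 \rangle = \int \frac{dr}{(2\pi)^f}\, F^C_\rho(r)\, \langle q_1 | W(-r) | q_2 \rangle = \int \frac{dq\, dp}{(2\pi)^f}\, F^C_\rho(q, p)\, \delta(q - q_1 + q_2)\, e^{-\frac{i}{2} p \cdot (q_1 + q_2)} = \int \frac{dp}{(2\pi)^f}\, F^C_\rho(q_1 - q_2, p)\, e^{-\frac{i}{2} p \cdot (q_1 + q_2)} . $$

By computing the trace in position representation, one gets

$$ F^C_\rho(q_1 - q_2, p) = \mathrm{Tr}\big( \rho\, W(q_1 - q_2, p) \big) = \int dx\, \langle x | \rho | x - q_1 + q_2 \rangle\, e^{i p \cdot \left( x - \frac{q_1 - q_2}{2} \right)} , $$

whence, from the representation $\int \frac{dp}{(2\pi)^f}\, e^{i p \cdot q} = \delta(q)$ of the Dirac delta,

$$ \langle q_1 | X | q_2 \rangle = \int dx \int \frac{dp}{(2\pi)^f}\, e^{i p \cdot (x - q_1)}\, \langle x | \rho | x - q_1 + q_2 \rangle = \langle q_1 | \rho | q_2 \rangle . $$

Taking derivatives of $F^C_\rho(r)$ at $r = 0$ with respect to the real variables $q_i$, $p_i$, respectively of $F^V_\rho(z)$ at $z = 0$ with respect to the complex variables $z_i$, $z_i^*$, one gets the expectation values of all products of position and momentum coordinates, respectively of annihilation and creation operators.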

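For instance, the first moments arise as follows,

$$ \partial_{z_i} F^V_\rho(z) \big|_{z=0} = \mathrm{Tr}(\rho\, a_i) , \qquad \partial_{z_i^*} F^V_\rho(z) \big|_{z=0} = -\mathrm{Tr}(\rho\, a_i^\dagger) , \tag{5.124} $$

while second moments can be extracted from the following relations, where for indices larger than $f$ the symbols $a_j$, $z_j$ and $\delta_{ij}$ are understood with $j$ replaced by $j - f$:

$$ \partial^2_{z_i z_j} F^V_\rho(z) \big|_{z=0} = \mathrm{Tr}(\rho\, a_i a_j) = \frac12 \mathrm{Tr}\big( \rho\, \{ A_i, A_j^\dagger \} \big) \tag{5.125} $$

for $1 \le i \le f$, $1 + f \le j \le 2f$,

$$ \partial^2_{z_i^* z_j^*} F^V_\rho(z) \big|_{z=0} = \mathrm{Tr}(\rho\, a_i^\dagger a_j^\dagger) = \frac12 \mathrm{Tr}\big( \rho\, \{ A_i, A_j^\dagger \} \big) \tag{5.126} $$

for $1 + f \le i \le 2f$, $1 \le j \le f$,

$$ -\partial^2_{z_i^* z_j} F^V_\rho(z) \big|_{z=0} = \frac{\delta_{ij}}{2} + \mathrm{Tr}(\rho\, a_i^\dagger a_j) = \frac12 \mathrm{Tr}\big( \rho\, \{ A_i, A_j^\dagger \} \big) \tag{5.127} $$

for $1 + f \le i \le 2f$, $1 + f \le j \le 2f$,

$$ -\partial^2_{z_i z_j^*} F^V_\rho(z) \big|_{z=0} = -\frac{\delta_{ij}}{2} + \mathrm{Tr}(\rho\, a_i a_j^\dagger) = \frac12 \mathrm{Tr}\big( \rho\, \{ A_i, A_j^\dagger \} \big) \tag{5.128} $$

for $1 \le i \le f$, $1 \le j \le f$.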

The link with the correlation matrix (5.121) is apparent; indeed, the previous moments arise from a Gaussian characteristic function of the form

$$ F^V_\rho(z) = e^{\, Z^* \cdot \langle A \rangle_\rho - \frac12 Z^* \cdot (V^\rho Z)} , \tag{5.129} $$

where $\langle A \rangle_\rho := \{ \mathrm{Tr}(\rho\, A_i) \}_{i=1}^{2f}$, and the vectors $Z$, $A$ are as in (5.99) and (5.97). Notice that (5.121) implies that the sesquilinear form

$$ (Z_1, Z_2) \mapsto Z_1^* \cdot (V^\rho Z_2) $$

is symmetric and positive.

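Taking into account (5.93) and (5.99), one passes from the complex vector $Z = (z^*, -z) \in \mathbb{C}^{2f}$ to the vector $r = (q, p)$ of canonical position and momentum coordinates by means of

$$ Z = \frac{1}{\sqrt 2} \begin{pmatrix} 1_f & -i\, 1_f \\ -1_f & -i\, 1_f \end{pmatrix} r = -i\, U_f^\dagger\, \Sigma_1\, r , \qquad \Sigma_1 := \begin{pmatrix} 0 & 1_f \\ 1_f & 0 \end{pmatrix} . $$

Since $U_f$, the matrix in (5.121), is unitary and $V^\rho = U_f^\dagger\, C^\rho\, U_f$, $F^V_\rho(z)$ becomes the following Gaussian function of $r \in \mathbb{R}^{2f}$,

$$ F^C_\rho(r) = e^{\, i\, r \cdot (\Sigma_1 \langle \hat r \rangle_\rho) - \frac12 r \cdot (\Sigma_1 C^\rho \Sigma_1 r)} , \tag{5.130} $$

where $\langle \hat r \rangle_\rho := \mathrm{Tr}(\rho\, \hat r)$.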

As in classical probability, the Gaussian form of the characteristic function is such that higher moments are determined by first and second moments. Obviously, not all quantum states have this property; if they do have it, they are called Gaussian states.

Examples 5.5.2

1. Coherent states $\rho = |u\rangle\langle u| = D(u) |0\rangle\langle 0| D^\dagger(u)$, $u \in \mathbb{C}^f$, are Gaussian; indeed, using (5.102),

$$ F^V_\rho(z) = \langle 0 | D^\dagger(u)\, e^{Z^* \cdot A}\, D(u) | 0 \rangle = \langle 0 | e^{Z^* \cdot (A + U)} | 0 \rangle = F^V_{|0\rangle\langle 0|}(z)\, e^{Z^* \cdot U} = e^{Z^* \cdot U - \frac12 \|z\|^2} , \qquad U = (u, u^*) \in \mathbb{C}^{2f} . $$

The first moments are $u = \langle u | a | u \rangle$, $u^* = \langle u | a^\dagger | u \rangle$, the correlation matrix $V^\rho = \frac12 \begin{pmatrix} 1_f & 0 \\ 0 & 1_f \end{pmatrix}$.
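From its characteristic function, one can easily determine whether a given Gaussian state $\rho$ is pure; indeed, by using Remark 5.5.2, one computes

$$ \mathrm{Tr}(\rho^2) = \int \frac{dr_1\, dr_2}{(2\pi)^{2f}}\, F^C_\rho(r_1)\, F^C_\rho(r_2)\, \mathrm{Tr}\big( W(-r_1)\, W(-r_2) \big) = \int \frac{dr}{(2\pi)^f}\, F^C_\rho(r)\, F^C_\rho(-r) = \int \frac{dr}{(2\pi)^f}\, e^{-r \cdot (\Sigma_1 C^\rho \Sigma_1 r)} = \frac{1}{2^f \sqrt{\mathrm{Det}(C^\rho)}} . $$

Therefore, $\rho$ is pure if and only if $\mathrm{Det}(C^\rho) = 4^{-f}$.

2. By means of the squeezing operators introduced in Example 5.4.4 one can introduce the so-called squeezed-coherent states:

$$ |\eta, z\rangle := S(\eta)\, D(z) |0\rangle , \tag{5.131} $$

where

$$ |\eta\rangle := S(\eta) |0\rangle \tag{5.132} $$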


is the so-called squeezed vacuum. Squeezed coherent states are Gaussian states; indeed,

$$ F^V_\rho(\eta, z) := \langle \eta, z | e^{Z^* \cdot A} | \eta, z \rangle = \langle 0 | D^\dagger(z)\, S^\dagger(\eta)\, e^{Z^* \cdot A}\, S(\eta)\, D(z) | 0 \rangle = \langle 0 | D^\dagger(z)\, e^{Z^* \cdot (S(\eta) A)}\, D(z) | 0 \rangle . \tag{5.133} $$

Then the result follows from the previous example and (5.106) which yields

$$ Z^* \cdot (S(\eta) A) = (S^\dagger(\eta) Z)^* \cdot A , $$

where $S(\eta)$ is the $2f \times 2f$ matrix implementing the squeezing transformation (5.105) of annihilation and creation operators.


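Using (5.107), one finds, for the squeezed vacuum ($z = 0$) with real $\eta_j = r_j$,

$$ \langle \eta | \hat q_j | \eta \rangle = \langle \eta | \hat p_j | \eta \rangle = 0 \quad \text{and} \quad \langle \eta | \hat q_j^2 | \eta \rangle = e^{2 r_j}\, \langle 0 | \hat q_j^2 | 0 \rangle , \qquad \langle \eta | \hat p_j^2 | \eta \rangle = e^{-2 r_j}\, \langle 0 | \hat p_j^2 | 0 \rangle . $$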




Then, one can decrease below the Heisenberg limit the uncertainty of $\hat q$, respectively $\hat p$, by choosing $r_j$ negative, respectively positive, at the price of increasing the uncertainty of $\hat p$, respectively $\hat q$.
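The purity criterion of the first example can be illustrated for one mode, where $\mathrm{Tr}(\rho^2) = 1 / (2 \sqrt{\mathrm{Det}(C^\rho)})$. The sketch below compares a squeezed vacuum, whose correlation matrix follows from the variances just computed, with a thermal state; the thermal correlation matrix $(\bar n + \frac12)\, 1_2$ is a standard fact assumed here and is not derived in the text, and the numerical values are arbitrary.

```python
import math


def purity_single_mode(c):
    """Tr(rho^2) = 1 / (2 sqrt(Det C)) for a one-mode Gaussian state."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return 1.0 / (2.0 * math.sqrt(det))


r = 0.6  # illustrative squeezing parameter
squeezed_vacuum = [[math.exp(2 * r) / 2, 0.0],
                   [0.0, math.exp(-2 * r) / 2]]

n_mean = 1.5  # assumed mean occupation number of the thermal state
thermal = [[n_mean + 0.5, 0.0],
           [0.0, n_mean + 0.5]]

purity_squeezed = purity_single_mode(squeezed_vacuum)
purity_thermal = purity_single_mode(thermal)
```

Squeezing redistributes the uncertainty between $\hat q$ and $\hat p$ without changing the determinant, so the state stays pure, whereas the thermal state has purity $1/(2 \bar n + 1) < 1$.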


Apart from their first moments, which can always be set equal to 0 by a suitable shift operated by a displacement operator $D(z)$ in (5.101), Gaussian states are completely determined by their correlation matrix. An interesting question is the following one: given a Gaussian function

$$ F(z) = e^{\, Z^* \cdot M - \frac12 Z^* \cdot (V Z)} , \tag{5.134} $$

with $M \in \mathbb{C}^{2f}$ an assigned complex vector and $V$ an assigned $(2f) \times (2f)$ positive matrix such that the associated sesquilinear form is symmetric,

$$ Z_1^* \cdot (V Z_2) = Z_2^* \cdot (V Z_1) , \tag{5.135} $$

is $F(z)$ the characteristic function of a Gaussian state $\rho$ with correlation matrix $V$ and first moments given by the components of $M$?

The answer is that it is so if and only if $V$ satisfies the conditions (5.122). While necessity descends from the uncertainty relations, sufficiency comes instead from the following general result [174].


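Proposition 5.5.1 A function $\mathbb{R}^{2f} \ni r \mapsto F^C(r)$ ($\mathbb{C}^{f} \ni z \mapsto F^V(z)$) is the characteristic function of a quantum state $\rho$ of $f$ degrees of freedom satisfying the CCR if and only if 1) $F^C(0) = 1$ ($F^V(0) = 1$), 2) $F^C(r)$ ($F^V(z)$) is continuous at $r = 0$ ($z = 0$) and 3) for any $n$-tuple $\{r_i\}_{i=1}^n$, $r_i \in \mathbb{R}^{2f}$, ($\{z_i\}_{i=1}^n$) the $n \times n$ matrix $F^C$ ($F^V$) with entries (see (5.100))

$$ F^C_{ij} := F^C(r_j - r_i)\, e^{-\frac{i}{2} \sigma(r_i, r_j)} \qquad \Big( F^V_{ij} := F^V(z_j - z_i)\, e^{\frac12 Z_i^* \cdot (\Sigma_3 Z_j)} \Big) $$

is positive definite.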
We postpone the proof of the proposition and instead show that if the positive $(2f) \times (2f)$ matrix $V$ in (5.134) satisfies $V + \frac12 \Sigma_3 \ge 0$, then there is a (Gaussian) state $\rho$ with $F(z)$ as characteristic function. We just need to consider condition 3) and prove that

$$ \sum_{i,j=1}^n u_i^* u_j\, e^{(Z_j^* - Z_i^*) \cdot M}\, e^{-\frac12 (Z_j^* - Z_i^*) \cdot (V (Z_j - Z_i))}\, e^{\frac12 Z_i^* \cdot (\Sigma_3 Z_j)} \ge 0 , $$

for all choices of $n$ complex vectors $z_i \in \mathbb{C}^f$ in $Z_i = (z_i^*, -z_i)$ and of $u_i \in \mathbb{C}$. Because of (5.135), we shall thus show that

$$ \sum_{i,j=1}^n w_i^* w_j\, e^{Z_i^* \cdot ((V + \frac12 \Sigma_3) Z_j)} \ge 0 , \qquad \text{where} \quad w_i := u_i\, e^{Z_i^* \cdot M - \frac12 Z_i^* \cdot (V Z_i)} . $$

Since $V + \frac12 \Sigma_3$ is positive, the same is true of the $n \times n$ Hermitian matrix $A := [A_{ij}]$, with entries $A_{ij} := Z_i^* \cdot ((V + \frac12 \Sigma_3) Z_j)$. Then, consider the spectral decomposition of $A$ with eigenvalues $a_\ell \ge 0$, $\ell = 1, 2, \ldots, n$, so that $A_{ij} = \sum_{\ell=1}^n a_\ell\, \psi_{\ell i}\, \psi_{\ell j}^*$ where $\psi_{\ell i}$ is the $i$-th component of the $\ell$-th eigenvector of $A$; then,

$$ \sum_{i,j=1}^n w_i^* w_j\, e^{A_{ij}} = \sum_{k=0}^\infty \frac{1}{k!} \sum_{\ell_1, \ldots, \ell_k = 1}^n a_{\ell_1} \cdots a_{\ell_k}\, \Big| \sum_{i=1}^n w_i^* \prod_{r=1}^k \psi_{\ell_r i} \Big|^2 \ge 0 . $$
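The last step is an instance of the fact that the entrywise exponential of a positive Hermitian matrix is again positive. As a hedged numerical illustration, the sketch below builds a positive matrix as a Gram matrix of a few vectors and evaluates the quadratic form $\sum_{i,j} w_i^* w_j\, e^{A_{ij}}$ on some trial weights; all vectors and weights are arbitrary choices made here.

```python
import cmath


def gram_matrix(vectors):
    """A_ij = <v_i, v_j> = sum_k conj(v_i[k]) v_j[k], a positive Hermitian matrix."""
    return [[sum(x.conjugate() * y for x, y in zip(vi, vj)) for vj in vectors]
            for vi in vectors]


def exp_quadratic_form(a, w):
    """sum_ij conj(w_i) w_j exp(A_ij); real up to rounding when A is Hermitian."""
    n = len(w)
    total = sum(w[i].conjugate() * w[j] * cmath.exp(a[i][j])
                for i in range(n) for j in range(n))
    return total.real


vectors = [(1 + 0j, 0.5j), (0.3 - 0.2j, 1 + 0j), (-0.7 + 0j, 0.4 + 0.9j)]
a = gram_matrix(vectors)

trial_weights = [
    (1 + 0j, -1 + 0j, 0j),
    (0.2 + 1j, -0.5j, 1 - 1j),
    (1 + 0j, 1j, -1 + 0j),
    (-2 + 0.5j, 1 + 0j, 0.3 - 0.7j),
]
values = [exp_quadratic_form(a, w) for w in trial_weights]
```

A finite set of trial weights is of course no proof; the series expansion in the text is what guarantees positivity for all $w$.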


Proof of Proposition 5.5.1 If $F^V(z) = F^V_\rho(z)$ in (5.123), then condition (1) in the statement of the proposition is satisfied because $\mathrm{Tr}(\rho) = 1$, while condition (2) is fulfilled since we assumed a strongly-continuous representation of the CCR and $\rho$ is a trace-class operator, so that

$$ \big| F^V_\rho(z) - 1 \big| \le \sum_j r_j\, \big| \langle r_j | \big( e^{Z^* \cdot A} - 1 \big) | r_j \rangle \big| , $$

where $r_j$ and $|r_j\rangle$ are eigenvalues and eigenvectors of $\rho$. As regards condition 3), observe that using (5.123) and (5.80), it turns out that

$$ \sum_{i,j=1}^n u_i^* u_j\, F^V_{ij} = \mathrm{Tr}(\rho\, X^\dagger X) \ge 0 , $$

where $X := \sum_{j=1}^n u_j \exp(Z_j^* \cdot A)$.

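The sufficiency of conditions 1), 2) and 3) is shown by using them to construct a strongly continuous representation of the CCR by Weyl operators $W(r)$ on a Hilbert space $\mathbb{H}$ and a density matrix $\rho$ on $\mathbb{H}$ such that $F^C(r)$ is of the form (5.123) [174]. One starts by defining the operators

$$ (W_0(r)\, \psi)(w) = e^{-\frac{i}{2} \sigma(r, w)}\, \psi(w + r) , \qquad r \in \mathbb{R}^{2f} , \tag{5.136} $$

on the functions on $\mathbb{R}^{2f}$; these operators satisfy the CCR (5.80):

$$ (W_0(r_1)\, W_0(r_2)\, \psi)(w) = e^{\frac{i}{2} \sigma(r_1, r_2)}\, (W_0(r_1 + r_2)\, \psi)(w) . $$

Then, one considers the linear span $\mathbb{K}_0$ of all functions on $\mathbb{R}^{2f}$ of the form

$$ \Psi_c(w) = \sum_{k=1}^n c_k \exp\Big( -\frac{i}{2} \sigma(r_k, w) \Big) , $$

for all finite $n \in \mathbb{N}$, and defines on it the sesquilinear form

$$ \langle \Psi_{c^1} | \Psi_{c^2} \rangle_F := \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} (c^1_i)^*\, c^2_j\, F^C(r^2_j - r^1_i)\, e^{\frac12 (Z^1_i)^* \cdot (\Sigma_3 Z^2_j)} , $$

where $z_{i,j}$ are related to $r_{i,j}$ via (5.99). Because of the positive semi-definiteness of $F^C$, $\langle \Psi_c | \Psi_c \rangle_F \ge 0$; thus, in analogy to the GNS construction, one takes the quotient of $\mathbb{K}_0$ by the kernel consisting of those $\Psi_c$ such that $\langle \Psi_c | \Psi_c \rangle_F = 0$ and then its completion with respect to the scalar product defined on the quotient by $\langle \cdot | \cdot \rangle_F$. This gives a Hilbert space $\mathbb{K}$ containing the constant function $1(r) = 1$ on $\mathbb{R}^{2f}$ for $\langle 1 | 1 \rangle_F = 1$; similarly to the GNS vector, $|1\rangle$ is cyclic for the family of operators $W_0(r)$ since

$$ \Psi_c(w) = \sum_{k=1}^n c_k \exp\Big( -\frac{i}{2} \sigma(r_k, w) \Big) = \sum_{k=1}^n c_k\, (W_0(r_k)\, 1)(w) $$

and

$$ \langle 1 | W_0(r) | 1 \rangle_F = \langle 1 | e^{-\frac{i}{2} \sigma(r, \cdot)} \rangle_F = F^C(r) . \tag{5.137} $$

Further, (5.136) yields

$$ (W_0(r)\, \Psi_c)(w) = \sum_{k=1}^n c_k\, e^{-\frac{i}{2} \sigma(r_k, r)}\, e^{-\frac{i}{2} \sigma(r_k + r, w)} , $$

whence, using (5.100),

$$ \langle W_0(r) \Psi_{c^1} | W_0(r) \Psi_{c^2} \rangle_F = \sum_{i,j} (c^1_i)^*\, e^{\frac{i}{2} \sigma(r^1_i, r)}\, c^2_j\, e^{-\frac{i}{2} \sigma(r^2_j, r)}\, F^C(r^2_j - r^1_i)\, e^{-\frac{i}{2} \sigma(r^1_i + r,\, r^2_j + r)} $$

$$ = \sum_{i,j} (c^1_i)^*\, c^2_j\, F^C(r^2_j - r^1_i)\, e^{-\frac{i}{2} \sigma(r^1_i, r^2_j)} = \langle \Psi_{c^1} | \Psi_{c^2} \rangle_F . $$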


The operators $W_0(r)$ can thus be extended to unitary operators on $\mathbb{K}$ where they provide a representation of the CCR. If the latter is strongly continuous, then, from Remark 5.4.2, it reduces to an orthogonal sum of unitarily equivalent irreducible representations. Namely, there exists an irreducible representation of the CCR on a Hilbert space $\mathbb{H}$ by Weyl operators $W(r)$ and an isometry $U : \mathbb{K} \mapsto \widetilde{\mathbb{H}} := \bigoplus_n \mathbb{H}$ such that

$$ W_0(r) = U^\dagger\, \widetilde W(r)\, U , \qquad \widetilde W(r) := \bigoplus_n W(r) . $$

Also, $U |1\rangle = \bigoplus_n |\psi_n\rangle$ is a normalized vector in $\widetilde{\mathbb{H}}$, namely $\sum_n \|\psi_n\|^2 = 1$, so that $\rho := \sum_n \lambda_n P_n$ is a density matrix on $\mathbb{H}$, where $\lambda_n := \|\psi_n\|^2$ and $P_n := |\widetilde\psi_n\rangle\langle\widetilde\psi_n|$ with $\widetilde\psi_n := \psi_n / \|\psi_n\|$. Now, from (5.137) it follows that

$$ \mathrm{Tr}(\rho\, W(r)) = \sum_n \langle \psi_n | W(r) | \psi_n \rangle = \langle U 1 | \widetilde W(r) | U 1 \rangle = \langle 1 | W_0(r) | 1 \rangle_F = F^C(r) , $$

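which concludes the proof. In order to show that the representation of the CCR by the Weyl operators $W_0(r)$ on $\mathbb{K}$ is strongly continuous, we shall first show that the conditions (1), (2) and (3) ensure uniform continuity of $F^C(r)$ at any $r$. Indeed, choosing vectors $r_1 = 0$, $r_2 = u$ and $r_3 = w$, it turns out that

$$ F^C = \begin{pmatrix} 1 & F^C(u) & F^C(w) \\ F^C(-u) & 1 & F^C(w - u)\, e^{-\frac{i}{2} \sigma(u, w)} \\ F^C(-w) & F^C(u - w)\, e^{\frac{i}{2} \sigma(u, w)} & 1 \end{pmatrix} ; $$

its positivity implies $F^C(-u) = F^C(u)^*$, $|F^C(u)| \le 1$ and

$$ \big| F^C(u) - F^C(w) \big|^2 \le 1 - \big| F^C(u - w) \big|^2 - 2\, \Re\Big( F^C(u)^* F^C(w)\, \big[ 1 - F^C(u - w)\, e^{\frac{i}{2} \sigma(u, w)} \big] \Big) \le 4\, \big| 1 - F^C(u - w)\, e^{\frac{i}{2} \sigma(u, w)} \big| . $$

Then, the strong continuity of the representation is a consequence of the fact that, for all $\Psi_c \in \mathbb{K}_0$, the contribution

$$ \langle \Psi_c | W_0(r) | \Psi_c \rangle_F = \sum_{i,j} c_i^* c_j\, e^{-\frac{i}{2} ( \sigma(r_j, r) + \sigma(r_i, r_j + r) )}\, F^C(r_j + r - r_i) $$

goes to $\langle \Psi_c | \Psi_c \rangle_F$ when $r \to 0$ in the equality

$$ \| (W_0(r) - 1) \Psi_c \|_F^2 = 2 \big( \langle \Psi_c | \Psi_c \rangle_F - \Re\, \langle \Psi_c | W_0(r) | \Psi_c \rangle_F \big) $$

and that this result can be extended to the whole of $\mathbb{K}$.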


Examples 5.5.3 (Two-Mode Gaussian States)

1. Consider two bosonic modes ($\hat r = (\hat q_1, \hat q_2, \hat p_1, \hat p_2)$) in a state $\rho$ with Gaussian characteristic function,

$$ F^C_\rho(r) = e^{-\frac12 r \cdot (\Sigma_1 C^\rho \Sigma_1 r)} , \qquad r = (q_1, q_2; p_1, p_2) . \tag{5.138} $$

With respect to (5.130) $\langle \hat r \rangle_\rho = 0$, a case which can always be attained by suitably translating $\rho$. This is more easily ascertained in terms of creation and annihilation operators as in (5.129); indeed, if $\rho$ is such that

$$ \langle A \rangle_\rho = \{ \mathrm{Tr}(\rho\, A_i) \}_{i=1}^4 =: U = (u, u^*) \ne 0 , $$

consider $\widetilde\rho := D^\dagger(u)\, \rho\, D(u)$, with $D(u)$ as in (5.101) (see also Example 5.5.2). Using (5.102) and (5.123),

$$ F^V_{\widetilde\rho}(z) = \mathrm{Tr}\big( \rho\, D(u)\, e^{Z^* \cdot A}\, D^\dagger(u) \big) = e^{-Z^* \cdot U}\, F^V_\rho(z) = e^{-Z^* \cdot U + Z^* \cdot U - \frac12 Z^* \cdot (V^\rho Z)} = e^{-\frac12 Z^* \cdot (V^\rho Z)} . $$
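It is convenient to rearrange $\hat r$ as $\hat R = M \hat r$, $M = M^{-1} = M^T$,

$$ \hat R = \begin{pmatrix} \hat q_1 \\ \hat p_1 \\ \hat q_2 \\ \hat p_2 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{M} \begin{pmatrix} \hat q_1 \\ \hat q_2 \\ \hat p_1 \\ \hat p_2 \end{pmatrix} . \tag{5.139} $$

As a consequence, the CCR (5.70) now read

$$ [\hat R_i, \hat R_j] = i\, \Omega_{ij} , \qquad \Omega := M J M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} , \tag{5.140} $$

and the characteristic function (5.138) becomes

$$ F^C_\rho(r) =: G(R) = e^{-\frac12 R \cdot (\Sigma_1 V \Sigma_1 R)} , \tag{5.141} $$

where $\Sigma_1 := \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$ and $V = \Big[ \frac12 \mathrm{Tr}\big( \rho\, \{ \hat R_i, \hat R_j \} \big) \Big]_{i,j=1}^4$. More explicitly, a same argument as the one that led to (5.120) shows that $V$ is a positive real $4 \times 4$ matrix of the form

$$ V = \begin{pmatrix} A & C \\ C^T & B \end{pmatrix} , \tag{5.142} $$

$$ 0 \le A := \begin{pmatrix} \mathrm{Tr}(\rho\, \hat q_1^2) & \frac12 \mathrm{Tr}(\rho\, \{ \hat q_1, \hat p_1 \}) \\ \frac12 \mathrm{Tr}(\rho\, \{ \hat q_1, \hat p_1 \}) & \mathrm{Tr}(\rho\, \hat p_1^2) \end{pmatrix} , \tag{5.143} $$

$$ 0 \le B := \begin{pmatrix} \mathrm{Tr}(\rho\, \hat q_2^2) & \frac12 \mathrm{Tr}(\rho\, \{ \hat q_2, \hat p_2 \}) \\ \frac12 \mathrm{Tr}(\rho\, \{ \hat q_2, \hat p_2 \}) & \mathrm{Tr}(\rho\, \hat p_2^2) \end{pmatrix} , \tag{5.144} $$

$$ C := \begin{pmatrix} \mathrm{Tr}(\rho\, \hat q_1 \hat q_2) & \mathrm{Tr}(\rho\, \hat q_1 \hat p_2) \\ \mathrm{Tr}(\rho\, \hat p_1 \hat q_2) & \mathrm{Tr}(\rho\, \hat p_1 \hat p_2) \end{pmatrix} . \tag{5.145} $$

By multiplying (5.120) on both sides by $M$, the necessary and sufficient conditions for $V$ to be the correlation matrix of a Gaussian state are

$$ V + \frac{i}{2} \Omega \ge 0 . \tag{5.146} $$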

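2. Every $2 \times 2$ real matrix $S$ of determinant 1 such that $S J_2 S^T = J_2$, where $J_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is the symplectic matrix (2.6) for one degree of freedom, can be used to define new one-mode canonical operators, $\hat r' = S \hat r$, $\hat r := \begin{pmatrix} \hat q \\ \hat p \end{pmatrix}$, $\hat r' := \begin{pmatrix} \hat q' \\ \hat p' \end{pmatrix}$: $[\hat r'_i, \hat r'_j] = i\, (J_2)_{ij}$. This fact allows for a great simplification of the basic structure of the correlation matrix $V$ [138,329]. As a first step, consider a positive, real symmetric $2 \times 2$ matrix $X$ with $a := \sqrt{\det(X)}$ and define $S := \sqrt{X} / \sqrt{a}$; it turns out that $X = S^T \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} S$. Furthermore, since $\sqrt X$ is symmetric and $J_2$ antisymmetric, $\sqrt X J_2 \sqrt X$ is antisymmetric and thus proportional to $J_2$. Its determinant is $a^2$ whence $\sqrt X J_2 \sqrt X = a\, J_2$ and $S J_2 S^T = J_2$. As a second step, use this result and let $S_{A,B}$ effect the symplectic diagonalization of the positive, real matrices $A$, $B$ in $V$, then

$$ V = \begin{pmatrix} S_A^T & 0 \\ 0 & S_B^T \end{pmatrix} \begin{pmatrix} \alpha\, 1_2 & \widetilde C \\ \widetilde C^T & \beta\, 1_2 \end{pmatrix} \begin{pmatrix} S_A & 0 \\ 0 & S_B \end{pmatrix} . $$

As a third and last step, notice that the real matrix $\widetilde C$ can be written as $\widetilde C = U \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix} V^T$, where $c_{1,2}$ are its singular values (see (5.16)) and $U$, $V$ are two orthogonal matrices. If their determinant is 1 then they also preserve $J_2$; otherwise, replace each of them that has determinant $-1$ by $U \sigma_3$, respectively $V \sigma_3$, and correspondingly replace $\begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix}$ by the diagonal matrix $\begin{pmatrix} \gamma_1 & 0 \\ 0 & \gamma_2 \end{pmatrix}$ obtained by multiplying it by $\sigma_3$ once for each replaced matrix,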
where now $\gamma_{1,2}$ need not be both positive. Therefore, any two-mode correlation matrix can be written as [125,329]

$$ V = \begin{pmatrix} U_A & 0 \\ 0 & U_B \end{pmatrix} \underbrace{\begin{pmatrix} \alpha & 0 & \gamma_1 & 0 \\ 0 & \alpha & 0 & \gamma_2 \\ \gamma_1 & 0 & \beta & 0 \\ 0 & \gamma_2 & 0 & \beta \end{pmatrix}}_{V_0} \begin{pmatrix} U_A^T & 0 \\ 0 & U_B^T \end{pmatrix} \tag{5.147} $$

by means of matrices $U_{A,B}$ such that $U_{A,B}\, J_2\, U_{A,B}^T = J_2$.

5 The above argument is the simplest formulation of a more general theorem of Williamson on the symplectic diagonalization of positive, real $2f \times 2f$ matrices [138].

In conclusion, by suitably changing canonical coordinates, one can always reduce $V$ to the standard form $V_0 = \begin{pmatrix} A_0 & C_0 \\ C_0^T & B_0 \end{pmatrix}$ where $A_0 := \alpha\, 1_2$, $B_0 := \beta\, 1_2$ and $C_0 := \begin{pmatrix} \gamma_1 & 0 \\ 0 & \gamma_2 \end{pmatrix}$. Then, for $V_0$, by imposing the positivity of the principal minors of $V_0 + \frac{i}{2} \Omega$, the condition (5.146) amounts to

1 
geeks (5.148) 


where $I_1 = \alpha^2 = \mathrm{Det}(A_0)$, $I_2 = \beta^2 = \mathrm{Det}(B_0)$, $I_3 = \gamma_1 \gamma_2 = \mathrm{Det}(C_0)$ and $I_4 = (\alpha\beta - \gamma_1^2)(\alpha\beta - \gamma_2^2) = \mathrm{Det}(V_0)$.

As the determinants $I_i$ are invariant under transformations as those leading from $V$ to $V_0$, that is $I_1 = \mathrm{Det}(A)$, $I_2 = \mathrm{Det}(B)$, $I_3 = \mathrm{Det}(C)$ and $I_4 = \mathrm{Det}(V)$, inequality (5.148) is necessary and sufficient to ensure that a positive, real $4 \times 4$ matrix $V$ as in (5.142)-(5.145) be the correlation matrix of a two-mode Gaussian state.
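3. Consider a two-mode Gaussian state $\rho$ as in (5.117); a necessary and sufficient condition such that $R_\rho(q_1,q_2;p_1,p_2) \ge 0$ is that the correlation matrix $\mathcal{V}$ satisfy

\[ \mathcal{V} - \frac{\mathbb{1}_4}{2} \ge 0 . \]

Indeed, using the argument at the end of Example 5.4.4, from the CCR relations (5.80) and (5.113) the characteristic function $F_\rho(r) := \mathrm{Tr}(\rho\,W(r))$ of $\rho$ results

\[ \begin{aligned} F_\rho(r) &= \int_{\mathbb{R}^4} \frac{dx\,dy}{\pi^2}\, R_\rho(x,y)\, \langle \mathrm{vac}|\, W(\sqrt2(x,-y))\, W(q,p)\, W(\sqrt2(-x,y))\, |\mathrm{vac}\rangle\\ &= \int_{\mathbb{R}^4} \frac{dx\,dy}{\pi^2}\, R_\rho(x,y)\, e^{i\sqrt2\,(q\cdot y + p\cdot x)}\, \langle \mathrm{vac}|\, W(q,p)\, |\mathrm{vac}\rangle\\ &= e^{-\|r\|^2/4} \int_{\mathbb{R}^4} \frac{dx\,dy}{\pi^2}\, R_\rho(x,y)\, e^{i\sqrt2\,(q\cdot y + p\cdot x)} . \end{aligned} \tag{5.149} \]

Because of the argument developed in the first one of the above examples, we can assume $F_\rho$ to be a Gaussian function with $\langle \hat r\rangle_\rho = 0$; thus, by Fourier transform and using (5.141), the result follows from $\|R\|^2 = \|r\|^2 = \|q\|^2 + \|p\|^2$ and

\[ R_\rho(u) = \int_{\mathbb{R}^4} \frac{dR}{4\pi^2}\, e^{-i\sqrt2\, u\cdot(\Sigma_1 M R)}\, G(R)\, e^{\|R\|^2/4} , \]

where $R := (q_1,p_1;q_2,p_2) \in \mathbb{R}^4$, $M_4(\mathbb{C}) \ni \Sigma_1 = \begin{pmatrix} 0 & \mathbb{1}_2\\ \mathbb{1}_2 & 0 \end{pmatrix}$ and $M_4(\mathbb{C}) \ni M$ is the matrix in (5.139). The factor $e^{\|R\|^2/4}$ subtracts $\mathbb{1}_4/2$ from the quadratic form in the exponent of the Gaussian $G(R)$, so that the Fourier transform $R_\rho(u)$ is a positive function if and only if the resulting quadratic form is positive semi-definite.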
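The first step of the second example above can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the convention $J_2 = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}$ (the check is insensitive to the overall sign), uses a hypothetical test matrix `X`, and computes the square root of a $2\times2$ positive matrix through the closed form $\sqrt{X} = (X + a\,\mathbb{1})/\sqrt{\mathrm{tr}X + 2a}$.

```python
import math

# Assumed convention for the 2x2 symplectic matrix; the check below
# does not depend on the overall sign of J2.
J2 = [[0.0, 1.0], [-1.0, 0.0]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def symplectic_factor(X):
    """Return (a, S) with a = sqrt(det X) and S = sqrt(X / a).

    X must be real, symmetric and positive definite. The square root
    uses the 2x2 closed form sqrt(X) = (X + a*1) / sqrt(tr X + 2a).
    """
    (x11, x12), (x21, x22) = X
    a = math.sqrt(x11 * x22 - x12 * x21)
    t = math.sqrt(x11 + x22 + 2.0 * a)
    root = [[(x11 + a) / t, x12 / t], [x21 / t, (x22 + a) / t]]
    S = [[entry / math.sqrt(a) for entry in row] for row in root]
    return a, S

X = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical positive test matrix
a, S = symplectic_factor(X)

# S preserves J2, and X = S^T (a 1) S = a S^T S
SJSt = matmul(matmul(S, J2), transpose(S))
aStS = [[a * v for v in row] for row in matmul(transpose(S), S)]
```

Both `SJSt` and `aStS` should reproduce `J2` and `X` up to rounding errors, since $\det S = 1$ and any real $2\times2$ matrix $S$ fulfils $S J_2 S^T = \det(S)\,J_2$.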
Remark 5.5.3 Matrices as those in the second example above form the so-called symplectic group for $f = 1$ degrees of freedom [312]; the symplectic group for $f > 1$ degrees of freedom consists of real matrices $S \in M_{2f}(\mathbb{R})$ of determinant 1 that preserve the symplectic matrix $J_f$, that is $S\,J_f\,S^T = J_f$. Setting as before $\hat r' := S\hat r$, the CCR are respected, namely $[\hat r'_i\,,\,\hat r'_j] = i\,(J_f)_{ij}$. Therefore, because of the unitary equivalence of the CCR representations (see Remark 5.4.2), there exists a unitary operator $U(S)$ on the representation Hilbert space $\mathbb{H} = L^2(\mathbb{R}^f)$ such that

$\hat r' = U^\dagger(S)\,\hat r\,U(S)$. From (5.77), the Weyl operators transform as


Ut(S) Wr) U(S) = IF) = IMEEM — (ST ry, (5.150) 
here, if S = C\ with A, B,C MR), tens = ( P C \ while thet 
where, if S = CTB wi , B,C € Mf , then § = BA while the trans- 


z T CT 
posed ST equals - E } 


C AT 

Let $\rho \in \mathbb{B}_1^+(\mathbb{H})$ be a density matrix for the $f$ Bosonic degrees of freedom and
consider the state $\rho_S := U(S)\,\rho\,U^\dagger(S)$ obtained by operating the symplectic transformation
of the canonical operators; because of (5.150), the characteristic functions (5.123)
of $\rho$ and $\rho_S$ are related by $F_{\rho_S}(r) = F_\rho(S^T r)$. With $r = (q,p)$, $u = (x,y) \in \mathbb{R}^{2f}$,
(5.149) generalizes to


2 du 
=e lrir/4 V2u-(Z1r) 
acai fe Gms heme 


where £; := ( et ), whence the corresponding functions R,,(u) and R, (u) 
ly Of 


in the P -representation (5.117) are related by 


ae as Rog a) i VEED = 
R2f (AT) 


— oo IS? 72/4 du es suis 
=¢ J ay ES We , (5.151) 


5.5.4 States in the Algebraic Approach 


As we have seen in Example 5.2.4 and Remark 5.2.4, expectations as in Definition 5.5.1 provide semi-norms that equip $\mathbb{B}(\mathbb{H})$ with a $w^*$ topology which is equivalent to the $\sigma$-weak topology. On the other hand, in Sect. 5.3.1, states have been defined as positive, normalized linear functionals on $C^*$ algebras which are continuous with respect to the uniform topology of $\mathbb{B}(\mathbb{H})$. Since this topology is finer and thus has more open subsets than the $\sigma$-weak one, in general expectations need not be also $\sigma$-weakly continuous. The following result [80] characterizes the space of density matrices within the more general space of states on $\mathbb{B}(\mathbb{H})$.
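Proposition 5.5.2 If $M \subseteq \mathbb{B}(\mathbb{H})$ is a von Neumann algebra with identity, all $\sigma$-weakly continuous expectations on $M$ have the form $M \ni X \mapsto \mathrm{Tr}(\rho\,X)$, with $\rho \in \mathbb{B}_1^+(\mathbb{H})$ a density matrix.


Proof From Example 5.2.4, any $\sigma$-weakly continuous functional $F : M \to \mathbb{C}$ takes the form $F(X) = \sum_n \langle \psi_n | X | \phi_n \rangle$, where $\{\psi_n\}$ and $\{\phi_n\}$ are sequences of vectors in $\mathbb{H}$ such that $\sum_n \|\psi_n\|^2 < \infty$ and $\sum_n \|\phi_n\|^2 < \infty$.

Set $|\Psi\rangle = \oplus_n |\psi_n\rangle$, $|\Phi\rangle = \oplus_n |\phi_n\rangle$ and consider the following representation of $\mathbb{B}(\mathbb{H})$ on their Hilbert space $\widetilde{\mathbb{H}} := \oplus_n \mathbb{H}$, $\pi(X)|\Psi\rangle = \oplus_n (X|\psi_n\rangle)$, $X \in \mathbb{B}(\mathbb{H})$. Then, $F(X) = \langle \Psi | \pi(X) | \Phi \rangle$. If $F$ is positive and $M \ni X \ge 0$, then

\[ F(X) = \frac{1}{4}\Big( \langle \Psi+\Phi | \pi(X) | \Psi+\Phi \rangle - \langle \Psi-\Phi | \pi(X) | \Psi-\Phi \rangle \Big) \le \frac{1}{4}\, \langle \Psi+\Phi | \pi(X) | \Psi+\Phi \rangle . \]

By considering the GNS construction based on the state vector $|\Psi+\Phi\rangle$ (once normalized), Remark 5.3.2.3 implies the existence of $0 \le T' = (S')^\dagger S' \le 1/4$ in the commutant of $\pi(M)$. Since $S'$ maps $\widetilde{\mathbb{H}}$ into itself, writing $S'|\Psi+\Phi\rangle = \oplus_n |\chi_n\rangle$,

\[ F(X) = \langle \Psi+\Phi |\, T'\,\pi(X)\, | \Psi+\Phi \rangle = \langle S'(\Psi+\Phi) |\, \pi(X)\, | S'(\Psi+\Phi) \rangle = \sum_n \langle \chi_n | X | \chi_n \rangle , \quad \forall X \in M . \]

Set $\rho := \sum_n |\chi_n\rangle\langle\chi_n|$; this operator is positive. If $F(\mathbb{1}) = 1$, then $\mathrm{Tr}(\rho) = 1$, whence $\rho \in \mathbb{B}_1^+(\mathbb{H})$ and $F(X) = \mathrm{Tr}(\rho\,X)$ for all $X \in M$.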


Remark 5.5.4 ([80]) As functionals, density matrices are normal as their $\sigma$-weak continuity is equivalent to the property of normal linear maps outlined in Remark 5.2.8.


Because of the convexity of the space of states (see Remark 5.3.2.5), one has that 


Proposition 5.5.3 The space of states $\mathcal{S}(S)$ of a quantum system $S$ is convex and the same density matrix can in general be decomposed into infinitely many different convex combinations of other density matrices, unless it is a pure state, which is thus extremal in $\mathcal{S}(S)$.


Proof Take any set of $0 \le X_j \in \mathbb{B}(\mathbb{H})$, $j \in J$, such that $\sum_{j\in J} X_j = \mathbb{1}$; as $\rho \ge 0$, in terms of its spectral decomposition $\rho = \sum_k r_k |\psi_k\rangle\langle\psi_k|$, its unique (positive) square-root is given by $\sqrt{\rho} = \sum_k \sqrt{r_k}\, |\psi_k\rangle\langle\psi_k|$. Then,

\[ \rho = \sum_{j\in J} \lambda_j\,\rho_j , \qquad \rho_j := \begin{cases} \dfrac{\sqrt{\rho}\,X_j\,\sqrt{\rho}}{\lambda_j} & \lambda_j \neq 0\\[1ex] 0 & \text{otherwise} \end{cases} , \qquad \lambda_j := \mathrm{Tr}(\rho\,X_j) . \tag{5.152} \]

Thus the same density matrix $\rho$ describes mixtures whose components are described by density matrices $\rho_j$; in turn, these can also be decomposed unless they are projectors, as, in this case, $\rho = |\psi\rangle\langle\psi|$ implies

\[ \sqrt{\rho}\,X_j\,\sqrt{\rho} = \langle \psi | X_j | \psi \rangle\, |\psi\rangle\langle\psi| , \]

so that $\rho_j = \rho$ for all choices of $X_j$ such that $\langle \psi | X_j | \psi \rangle \neq 0$.


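Example 5.5.4 ([187]) Consider two generic decompositions of the same density matrix $\rho \in \mathcal{S}(S)$ into non-orthogonal one-dimensional projectors,

\[ \rho = \sum_{p=1}^{P} \underbrace{\sqrt{\rho}\,|\psi_p\rangle\langle\psi_p|\,\sqrt{\rho}}_{|w_p\rangle\langle w_p|} = \sum_{q=1}^{Q} \underbrace{\sqrt{\rho}\,|\phi_q\rangle\langle\phi_q|\,\sqrt{\rho}}_{|z_q\rangle\langle z_q|} , \]

with $\sum_{p=1}^P |\psi_p\rangle\langle\psi_p| = \mathbb{1}$ and $\sum_{q=1}^Q |\phi_q\rangle\langle\phi_q| = \mathbb{1}$. By means of the spectral representation $\rho = \sum_{j=1}^N r_j |r_j\rangle\langle r_j|$, setting $|v_j\rangle := \sqrt{r_j}\,|r_j\rangle$, one gets

\[ |w_p\rangle = \sum_{j=1}^{N} \underbrace{\langle r_j | \psi_p \rangle}_{(W^\dagger)_{jp}}\, |v_j\rangle , \qquad |z_q\rangle = \sum_{j=1}^{N} \underbrace{\langle r_j | \phi_q \rangle}_{(Z^\dagger)_{jq}}\, |v_j\rangle . \]

The $P\times N$ matrix $W : \mathbb{C}^N \to \mathbb{C}^P$ with entries $W_{pj} := \langle \psi_p | r_j \rangle$ and the $Q\times N$ matrix $Z : \mathbb{C}^N \to \mathbb{C}^Q$ with entries $Z_{qj} := \langle \phi_q | r_j \rangle$ are such that $W^\dagger W = \mathbb{1}_N$ and $Z^\dagger Z = \mathbb{1}_N$. Then $|v_\ell\rangle = \sum_{p=1}^P W_{p\ell}\,|w_p\rangle$ and $|z_q\rangle = \sum_{p=1}^P V_{pq}\,|w_p\rangle$, where $V = W Z^\dagger : \mathbb{C}^Q \to \mathbb{C}^P$. Thus, any two decompositions of $\rho$ into projections are related by a $P\times Q$ matrix $V$ such that $V V^\dagger = W W^\dagger$ and $V^\dagger V = Z Z^\dagger$; therefore, if $P = Q = \mathrm{rank}(\rho)$, then $W$, $Z$ and hence $V$ are unitary matrices on the support of $\rho$. Also, if $P = \mathrm{rank}(\rho)$, but $Q$ is arbitrary, then $W$ is unitary on the support of $\rho$ and $V V^\dagger = \mathbb{1}_P$, that is $V^\dagger$ is an isometry.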


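In Sect. 5.3.1, states on $C^*$ algebras have been used to construct Hilbert space representations; in the present setting, a representation on a concrete Hilbert space $\mathbb{H}$ is a priori given. It is nevertheless instructive to consider pure and mixed states of a finite dimensional quantum system $S$. In such cases, the GNS representation amounts to what in quantum information is known as mixed state purification.

Let $\rho \in M_N(\mathbb{C})$ be a density matrix and consider its spectral representation $\rho = \sum_{j=1}^N r_j |r_j\rangle\langle r_j|$, some eigenvalues possibly being equal to zero. To $\rho$ one associates the state vector $|\sqrt{\rho}\rangle \in \mathbb{C}^N \otimes \mathbb{C}^N$ given by

\[ |\sqrt{\rho}\rangle := \sum_{j=1}^{N} \sqrt{r_j}\; |r_j\rangle \otimes |r_j\rangle . \tag{5.153} \]

Given $X \in M_N(\mathbb{C})$, let it be represented by $\pi(X) = X \otimes \mathbb{1}_N$ on $\mathbb{C}^N \otimes \mathbb{C}^N$,

\[ X \otimes \mathbb{1}_N\, |\sqrt{\rho}\rangle = \sum_{j=1}^{N} \sqrt{r_j}\, (X|r_j\rangle) \otimes |r_j\rangle = \sum_{j,k=1}^{N} \sqrt{r_j}\, \langle r_k | X | r_j \rangle\; |r_k\rangle \otimes |r_j\rangle =: |X\sqrt{\rho}\rangle , \tag{5.154} \]

whence $\langle \sqrt{\rho}\,|\, X \otimes \mathbb{1}_N\, |\sqrt{\rho}\rangle = \langle \sqrt{\rho}\,|\,X\sqrt{\rho}\rangle = \mathrm{Tr}(\rho\,X)$.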
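The identity $\langle\sqrt{\rho}|X\otimes\mathbb{1}_N|\sqrt{\rho}\rangle = \mathrm{Tr}(\rho X)$ lends itself to a quick numerical check. The sketch below is only illustrative: it works in the eigenbasis of $\rho$, so that $\rho$ is specified by its (hypothetical) spectrum `r`, and it stores vectors of $\mathbb{C}^N\otimes\mathbb{C}^N$ as flat lists with index `j * N + k`; the helper names are assumptions of this example.

```python
import math

def purification(r):
    """Components of |sqrt(rho)> = sum_j sqrt(r_j) |r_j> (x) |r_j>."""
    n = len(r)
    vec = [0.0] * (n * n)
    for j, rj in enumerate(r):
        vec[j * n + j] = math.sqrt(rj)
    return vec

def apply_left(X, vec):
    """Apply X (x) 1 to a vector of C^N (x) C^N stored as vec[j*N + k]."""
    n = len(X)
    out = [0j] * (n * n)
    for j in range(n):
        for k in range(n):
            out[j * n + k] = sum(X[j][l] * vec[l * n + k] for l in range(n))
    return out

def inner(u, v):
    """Scalar product, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

r = [0.5, 0.3, 0.2]                                # spectrum of rho
X = [[1, 2j, 0], [-2j, 3, 1], [0, 1, -1]]          # matrix in the eigenbasis of rho

omega = purification(r)
expectation = inner(omega, apply_left(X, omega))   # <sqrt(rho)| X (x) 1 |sqrt(rho)>
trace_rho_X = sum(r[j] * X[j][j] for j in range(len(r)))
```

Only the diagonal entries of `X` in the eigenbasis of $\rho$ contribute, which is why the off-diagonal entries of the test matrix do not affect the result.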


5.5.4.1 Pure States 

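If $\rho = |\psi\rangle\langle\psi|$, $\psi \in \mathbb{C}^N$, then $|\sqrt{\rho}\rangle = |\psi\rangle \otimes |\psi\rangle \in \mathbb{C}^N \otimes \mathbb{C}^N$ and $\pi(X)|\sqrt{\rho}\rangle = X|\psi\rangle \otimes |\psi\rangle$, for all $X \in M_N(\mathbb{C})$, whence the GNS Hilbert space is (isomorphic to) $\mathbb{C}^N$. Indeed, by taking the quotient of $M_N(\mathbb{C})$ with respect to the set

\[ \mathcal{I} := \Big\{ X \in M_N(\mathbb{C}) \;:\; \langle \psi | X^\dagger X | \psi \rangle = 0 \Big\} , \]

the equivalence classes $|\Psi_X\rangle$ are identified with operators of the form $X|\psi\rangle\langle\psi|$. By varying $X$, the GNS Hilbert space $\mathbb{H}_\rho$ is generated by vectors $|\phi\rangle\langle\psi|$ for all $\phi \in \mathbb{C}^N$ and is thus isomorphic to $\mathbb{C}^N$. It follows that the GNS representation $\pi_\rho(M_N(\mathbb{C}))$ is unitarily equivalent to $M_N(\mathbb{C})$ and thus irreducible in agreement with Remark 5.3.2.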


5.5.4.2 Faithful Density Matrices 

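At the opposite end with respect to pure states, let us consider a density matrix $\rho \in M_N(\mathbb{C})$ with eigenvalues $r_j$ all different from zero, so that $\mathrm{Tr}(\rho\,X^\dagger X) = 0 \Rightarrow X = 0$. According to Definition 5.3.5, $\rho$ is faithful.

Matrices $X \in M_N(\mathbb{C})$ become $N^2$-dimensional vectors whose components are their matrix elements with respect to the ONB consisting of the eigenvectors of $\rho$, $|X\rangle = \sum_{i,j=1}^N \langle r_i | X | r_j \rangle\, |r_i\rangle \otimes |r_j\rangle$. Also, by varying $X \in M_N(\mathbb{C})$, the linear span of vectors of the form $|X\sqrt{\rho}\rangle$ is dense in $\mathbb{C}^N \otimes \mathbb{C}^N$. Let $\Psi = \sum_{i,j=1}^N \psi_{ij}\, |r_i\rangle \otimes |r_j\rangle \in \mathbb{C}^N \otimes \mathbb{C}^N$ and $X = |r_p\rangle\langle r_q|$, then $\langle \Psi |\, X \otimes \mathbb{1}_N\, |\sqrt{\rho}\rangle = \psi^*_{pq}\,\sqrt{r_q}$. Therefore, if $\Psi$ is orthogonal to the linear span of the vectors $|X\sqrt{\rho}\rangle$, then $\psi_{pq} = 0$ for all $p, q$ as $\rho$ is faithful.

Because of Remark 5.3.2.1, the triplet $(\mathbb{C}^N \otimes \mathbb{C}^N, \pi, |\sqrt{\rho}\rangle)$ is unitarily equivalent to the GNS triplet $(\mathbb{H}_\rho, \pi_\rho, \Omega_\rho)$ corresponding to the expectation functional $\omega_\rho : M_N(\mathbb{C}) \ni X \mapsto \omega_\rho(X) := \mathrm{Tr}(\rho\,X)$.

The matrix algebra $M_N(\mathbb{C})$ is represented by $M_N(\mathbb{C}) \otimes \mathbb{1}_N$ on $\mathbb{H}_\rho$; so, its commutant is $\pi_\rho(M_N(\mathbb{C}))' = \mathbb{1}_N \otimes M_N(\mathbb{C})$, $\pi_\rho(M_N(\mathbb{C}))$ has trivial center and is thus a factor. The action of the commutant is given by

\[ \mathbb{1}_N \otimes X\, |\sqrt{\rho}\rangle = \sum_{j=1}^{N} \sqrt{r_j}\; |r_j\rangle \otimes X|r_j\rangle = |\sqrt{\rho}\,X^T\rangle , \tag{5.155} \]

where $X^T$ denotes the transposition of $X$ with respect to the eigenbasis of $\rho$.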




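We can now look at the decomposers in (5.152) from the point of view of Remark 5.3.2.3. Given a convex decomposition $\rho = \sum_{j\in J} \lambda_j\,\sigma_j$, every $\sigma_j$ corresponds to a unique $0 \le X'_j$ in the commutant $\pi(M_N(\mathbb{C}))'$, thence to a unique $0 \le X_j \in M_N(\mathbb{C})$, such that $X'_j = \mathbb{1}_N \otimes X_j^T$ and

\[ \lambda_j\,\sigma_j(X) = \langle \sqrt{\rho}\,|\, \pi(X)\,X'_j\, |\sqrt{\rho}\rangle = \langle \sqrt{\rho}\,|\, X \otimes X_j^T\, |\sqrt{\rho}\rangle = \langle \sqrt{\rho}\,|\, X\sqrt{\rho}\,X_j \rangle = \mathrm{Tr}(\sqrt{\rho}\,X_j\,\sqrt{\rho}\,X) , \tag{5.156} \]

whence $\lambda_j = \mathrm{Tr}(\rho\,X_j)$ and $\sigma_j = (\sqrt{\rho}\,X_j\,\sqrt{\rho})/\lambda_j$.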


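Example 5.5.5 Consider a two-level system equipped with the density matrix $\rho = \frac{1}{2}\begin{pmatrix} 1-s & 0\\ 0 & 1+s \end{pmatrix}$, $0 < s < 1$; as GNS vector we can take its purification (5.153)

\[ |\sqrt{\rho}\rangle = \sqrt{\frac{1-s}{2}}\; |0\rangle \otimes |0\rangle + \sqrt{\frac{1+s}{2}}\; |1\rangle \otimes |1\rangle , \]

where $|0\rangle$, $|1\rangle$ are the eigenstates of $\rho$. The corresponding GNS representation is $\pi_\rho(M_2(\mathbb{C})) = M_2(\mathbb{C}) \otimes \mathbb{1}_2$ with GNS Hilbert space $\mathbb{C}^4$,

\[ \langle \sqrt{\rho}\,|\, X \otimes \mathbb{1}_2\, |\sqrt{\rho}\rangle = \frac{1-s}{2}\, \langle 0 | X | 0 \rangle + \frac{1+s}{2}\, \langle 1 | X | 1 \rangle = \mathrm{Tr}(\rho\,X) , \]

for all $X \in M_2(\mathbb{C})$. The commutant is $\pi_\rho(M_2(\mathbb{C}))' = \mathbb{1}_2 \otimes M_2(\mathbb{C})$ so that $\pi_\rho$ is reducible and $\pi_\rho(M_2(\mathbb{C}))$ is a factor since $\pi_\rho(M_2(\mathbb{C}))'' = \pi_\rho(M_2(\mathbb{C}))$ with trivial center (see Definition 5.3.4): $\mathcal{Z}_\rho = \pi_\rho(M_2(\mathbb{C}))' \cap \pi_\rho(M_2(\mathbb{C})) = \{\lambda\,\mathbb{1}\}$.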


5.5.4.3 Modular Theory 
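The GNS state of any faithful density matrix is separating for $\pi_\rho(M_N(\mathbb{C}))$, namely $\pi_\rho(X)|\sqrt{\rho}\rangle = 0 \Leftrightarrow X = 0$, and thus cyclic for the commutant $\pi_\rho(M_N(\mathbb{C}))'$ (see Lemma 5.3.1).

We shall now give the fundamentals of the so-called modular theory that looks particularly simple for finite-level systems.

Let $\rho$ be a faithful state and identify its GNS triplet $(\mathbb{H}_\rho, \pi_\rho, \Omega_\rho)$ with $(\mathbb{C}^N \otimes \mathbb{C}^N, M_N(\mathbb{C}) \otimes \mathbb{1}_N, |\sqrt{\rho}\rangle)$. The so-called modular conjugation is the antilinear map $J_\rho : \mathbb{C}^N \otimes \mathbb{C}^N \to \mathbb{C}^N \otimes \mathbb{C}^N$ such that

\[ |\Psi\rangle = \sum_{i,j=1}^{N} \psi_{ij}\, |r_i\rangle \otimes |r_j\rangle \;\mapsto\; J_\rho|\Psi\rangle = \sum_{i,j=1}^{N} \psi^*_{ij}\, |r_j\rangle \otimes |r_i\rangle . \tag{5.157} \]

It satisfies $J_\rho^2 = \mathbb{1}$ and $J_\rho|\sqrt{\rho}\rangle = |\sqrt{\rho}\rangle$; furthermore,

\[ J_\rho|X\sqrt{\rho}\rangle = J_\rho\,(X \otimes \mathbb{1}_N)\,J_\rho\,|\sqrt{\rho}\rangle = \sum_{i,k=1}^{N} \sqrt{r_i}\, \langle r_k | X | r_i \rangle^*\; |r_i\rangle \otimes |r_k\rangle = \mathbb{1}_N \otimes X^*\, |\sqrt{\rho}\rangle = |\sqrt{\rho}\,X^\dagger\rangle , \tag{5.158} \]

where $X^*$ is the conjugate of $X$ with respect to the ONB $\{|r_j\rangle\}_{j=1}^N$, that is $\langle r_k | X^* | r_j \rangle = (\langle r_k | X | r_j \rangle)^*$. Given $X, Y, Z \in M_N(\mathbb{C})$, one explicitly computes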


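\[ \begin{aligned} \Big[ J_\rho\,(X \otimes \mathbb{1}_N)\,J_\rho\;,\; Y \otimes \mathbb{1}_N \Big]\, |Z\sqrt{\rho}\rangle &= J_\rho\,(X \otimes \mathbb{1}_N)\,J_\rho\, |YZ\sqrt{\rho}\rangle - (Y \otimes \mathbb{1}_N)\,J_\rho\,(X \otimes \mathbb{1}_N)\, |\sqrt{\rho}\,Z^\dagger\rangle\\ &= J_\rho\, |X\sqrt{\rho}\,(YZ)^\dagger\rangle - (Y \otimes \mathbb{1}_N)\, |Z\sqrt{\rho}\,X^\dagger\rangle\\ &= |YZ\sqrt{\rho}\,X^\dagger\rangle - |YZ\sqrt{\rho}\,X^\dagger\rangle = 0 . \end{aligned} \]

Thus, $J_\rho\,(X \otimes \mathbb{1}_N)\,J_\rho$ belongs to the commutant $\mathbb{1}_N \otimes M_N(\mathbb{C}) = \pi_\rho(M_N(\mathbb{C}))'$ of $M_N(\mathbb{C}) \otimes \mathbb{1}_N = \pi_\rho(M_N(\mathbb{C}))$; since the GNS vector $|\sqrt{\rho}\rangle$ is cyclic for $\pi_\rho(M_N(\mathbb{C}))$ it is separating for $\pi_\rho(M_N(\mathbb{C}))'$. Therefore, from (5.158),

\[ J_\rho\,(X \otimes \mathbb{1}_N)\,J_\rho = \mathbb{1}_N \otimes X^* , \tag{5.159} \]

whence $J_\rho$ antilinearly embeds $\pi_\rho(M_N(\mathbb{C}))$ into its commutant $\pi_\rho(M_N(\mathbb{C}))'$. Actually, the embedding is an anti-isomorphism,

\[ J_\rho\,\pi_\rho(M_N(\mathbb{C}))\,J_\rho = \pi_\rho(M_N(\mathbb{C}))' . \tag{5.160} \]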


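Indeed, using (5.155), for any $S' \in \pi_\rho(M_N(\mathbb{C}))'$ it holds

\[ S'|\sqrt{\rho}\rangle = |Y_S\sqrt{\rho}\rangle = \Big|\sqrt{\rho}\,\frac{1}{\sqrt{\rho}}\,Y_S\,\sqrt{\rho}\Big\rangle = J_\rho \Big( \sqrt{\rho}\,Y_S^\dagger\,\frac{1}{\sqrt{\rho}} \otimes \mathbb{1}_N \Big) J_\rho\, |\sqrt{\rho}\rangle , \]

where the first equality is due to the fact that $S'|\sqrt{\rho}\rangle$ is a vector in $\mathbb{H}_\rho$ which can be obtained by acting on the GNS vector with some $\pi_\rho(Y_S)$. Then, since $|\sqrt{\rho}\rangle$ is separating for the commutant,

\[ S' = J_\rho \Big( \sqrt{\rho}\,Y_S^\dagger\,\frac{1}{\sqrt{\rho}} \otimes \mathbb{1}_N \Big) J_\rho . \]

A related notion is that of modular operator

\[ \Delta_\rho := \rho \otimes \rho^{-1} , \tag{5.161} \]

which, according to (5.158), is such that

\[ J_\rho\,\Delta_\rho^{1/2}\,|X\sqrt{\rho}\rangle = J_\rho \Big( \sqrt{\rho} \otimes \frac{1}{\sqrt{\rho}} \Big) |X\sqrt{\rho}\rangle = J_\rho\,|\sqrt{\rho}\,X\rangle = |X^\dagger\sqrt{\rho}\rangle . \tag{5.162} \]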
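The relations (5.159) and (5.162) can be tested with a few lines of code. In the following sketch, which is an illustration and not part of the text, a vector $|A\rangle = \sum_{i,j} A_{ij}\,|r_i\rangle\otimes|r_j\rangle$ is stored as the matrix $A$ in the eigenbasis of $\rho$; under this (assumed) bookkeeping $(X\otimes Y)|A\rangle = |X A Y^T\rangle$ and the modular conjugation acts as $A \mapsto A^\dagger$. The spectrum `r` and the matrices `X`, `A` are hypothetical test data.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Adjoint; on vectors |A> this implements the modular conjugation J."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

r = [0.6, 0.3, 0.1]                       # spectrum of a faithful rho
sqrt_rho = diag([math.sqrt(x) for x in r])
inv_sqrt_rho = diag([1.0 / math.sqrt(x) for x in r])
X = [[1, 2 + 1j, 0], [0.5j, -1, 3], [2, 0, 1j]]
A = [[0, 1j, 2], [1, 1, 0], [3, -1j, 0.5]]

# (5.162): J Delta^{1/2} |X sqrt(rho)> = |X^dagger sqrt(rho)>
vec = matmul(X, sqrt_rho)                               # |X sqrt(rho)>
after_delta = matmul(matmul(sqrt_rho, vec), inv_sqrt_rho)  # Delta^{1/2} acts as A -> sqrt(rho) A rho^{-1/2}
tomita = dagger(after_delta)
expected_tomita = matmul(dagger(X), sqrt_rho)

# (5.159): J (X (x) 1) J |A> = (1 (x) X^*) |A> = |A X^dagger>
lhs = dagger(matmul(X, dagger(A)))
rhs = matmul(A, dagger(X))
```

Representing vectors as matrices keeps the example short; a flat-vector implementation with explicit Kronecker products would give the same results.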
5.5.5 Density Matrices and von Neumann Entropy


From the point of view of the GNS construction, pure states can be distinguished from mixed states because their GNS representations are reducible factors for the latter, while they are irreducible factors for the former. There are however handier ways to sort these states out; perhaps the easiest is to consider $\rho^2$: if more than one eigenvalue of $\rho$ is non zero, then $\rho$ is mixed since then $\rho^2 \neq \rho$. Indeed, the spectrum of a mixed state $\rho$ has a richer structure than that of any one-dimensional projection.

The possibility of comparing, in some cases, the eigenvalues of two density matrices $\rho_{1,2} \in \mathbb{B}_1^+(\mathbb{H})$ comes from the so-called minmax principle of which we give a short sketch (for a general formulation and proof see [297]). We shall consider the eigenvalues listed in decreasing order, namely, in the spectral decompositions $\mathbb{B}_1^+(\mathbb{H}) \ni \rho = \sum_i r_i\, |r_i\rangle\langle r_i|$, with eigenvalues repeated according to their multiplicities, we shall take $r_i \ge r_{i+1}$ for all $i$. Because of the ordering, it turns out that

\[ r_i = \sup\Big\{ \langle \psi | \rho | \psi \rangle \;:\; \|\psi\| = 1 ,\; |\psi\rangle \perp \{|r_j\rangle\}_{j=1}^{i-1} \Big\} . \tag{5.163} \]

Indeed, for a $\psi$ specified as above, $\langle \psi | \rho | \psi \rangle = \sum_{k\ge i} r_k\, |\langle \psi | r_k \rangle|^2 \le r_i$ and $r_i$ is achieved by choosing $|\psi\rangle = |r_i\rangle$. The minmax principle asserts that

\[ r_i = \inf_{\{\phi_j\}_{j=1}^{i-1}}\, \sup\Big\{ \langle \psi | \rho | \psi \rangle \;:\; \|\psi\| = 1 ,\; |\psi\rangle \perp \{|\phi_1\rangle, |\phi_2\rangle, \ldots, |\phi_{i-1}\rangle\} \Big\} , \tag{5.164} \]

where $\{\phi_j\}_{j=1}^{i-1}$ is any set of vectors in $\mathbb{H}$.

In order to prove this relation, we denote by $U_\rho(\{\phi_j\}_{j=1}^{i-1})$ the argument of the inf and show that $U_\rho(\{\phi_j\}_{j=1}^{i-1}) \ge r_i$, whence the result follows for $r_i$ is achieved by choosing $|\phi_j\rangle = |r_j\rangle$, $j = 1, 2, \ldots, i-1$. Now, there surely exists a normalized vector $|\psi\rangle = \sum_{k=1}^{i} c_k\, |r_k\rangle \perp \{\phi_j\}_{j=1}^{i-1}$; indeed, if $P$ projects onto the linear span of $\{|r_j\rangle\}_{j=1}^{i}$, the vectors $\{P|\phi_\ell\rangle\}_{\ell=1}^{i-1}$ span at most an $(i-1)$-dimensional subspace. But then,

\[ U_\rho(\{\phi_j\}_{j=1}^{i-1}) \ge \langle \psi | \rho | \psi \rangle = \sum_{k=1}^{i} r_k\, |c_k|^2 \ge r_i \sum_{k=1}^{i} |c_k|^2 = r_i . \]

From the minmax principle, it follows that $\rho_1 \ge \rho_2 \Rightarrow e_i(\rho_1) \ge e_i(\rho_2)$, where $e_i(\rho)$ is the $i$-th one in the ordered list of eigenvalues of $\rho$. Further, for generic $\rho_{1,2} \in \mathbb{B}_1^+(\mathbb{H})$, the minmax principle provides an upper bound to the differences $|e_i(\rho_1) - e_i(\rho_2)|$ in terms of the trace-norm (5.21).


Example 5.5.6 Given $\rho_{1,2} \in \mathbb{B}_1^+(\mathbb{H})$, decompose $\rho_1 - \rho_2 = R_+ - R_-$, where $R_\pm$ are positive orthogonal operators, so that

\[ \|\rho_1 - \rho_2\|_1 = \mathrm{Tr}(R_+ + R_-) = \mathrm{Tr}(2R - \rho_1 - \rho_2) , \]

where $R := \rho_1 + R_- = \rho_2 + R_+ \ge \rho_{1,2}$. Let $r_i$, respectively $r_i^{1,2}$, be the eigenvalues of $R$, respectively $\rho_{1,2}$, listed in decreasing order; then, by the minmax principle, $r_i \ge r_i^{1,2}$ for all $i$. Thus, $2r_i - r_i^1 - r_i^2 \ge |r_i^1 - r_i^2| \Rightarrow \sum_i |r_i^1 - r_i^2| \le \|\rho_1 - \rho_2\|_1$.


The spectrum of a density matrix is a classical probability distribution: this hints at the possibility of quantifying its information content by means of the Shannon entropy of such a distribution. This leads to the notion of von Neumann entropy of a state $\rho \in \mathbb{B}_1^+(\mathbb{H})$ [268].


Definition 5.5.2 (von Neumann Entropy) Given $\rho \in \mathcal{S}(S)$ with spectral decomposition (degenerate eigenvalues are repeated according to their multiplicity and with chosen orthogonal one-dimensional eigenprojectors) $\rho = \sum_j r_j\, |r_j\rangle\langle r_j|$, the von Neumann entropy of $\rho$ is the Shannon entropy of the probability distribution corresponding to its spectrum: $S(\rho) = -\mathrm{Tr}(\rho \log \rho) = -\sum_j r_j \log r_j$.
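As a small illustration of Definition 5.5.2, the function below evaluates the von Neumann entropy directly from the spectrum of $\rho$. It is a sketch under two assumptions that are not fixed by the definition: the natural logarithm is used, and the convention $0 \log 0 := 0$ is adopted for vanishing eigenvalues. The name `von_neumann_entropy` and the test spectra are hypothetical.

```python
import math

def von_neumann_entropy(spectrum, tol=1e-9):
    """S(rho) = -sum_j r_j log r_j, with the convention 0 log 0 := 0.

    `spectrum` lists the eigenvalues of rho, repeated according to
    their multiplicities; natural logarithms are assumed.
    """
    if any(r < 0 for r in spectrum) or abs(sum(spectrum) - 1.0) > tol:
        raise ValueError("spectrum must be a probability distribution")
    return -sum(r * math.log(r) for r in spectrum if r > 0)

pure = [1.0, 0.0, 0.0]                 # one-dimensional projector
mixed = [0.5, 0.3, 0.2]
maximally_mixed = [1 / 3, 1 / 3, 1 / 3]

entropies = [von_neumann_entropy(s) for s in (pure, mixed, maximally_mixed)]
```

The three values increase from $0$ for the pure state up to $\log 3$ for the state $\mathbb{1}_3/3$.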


The following are some of the properties of the von Neumann entropy: they show 
that it plays a role similar to that of the Shannon entropy in a classical context. Other 
properties more related to composite systems will be discussed in Sect. 5.5.6. 


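Proposition 5.5.4 Let $\rho \in \mathbb{B}_1^+(\mathbb{H})$ be a density matrix. If $\mathbb{H} = \mathbb{C}^N$, then the von Neumann entropy is bounded by the entropy of the state $\mathbb{1}_N/N$:

\[ 0 \le S(\rho) \le \log N . \tag{5.165} \]

The von Neumann entropy is concave; that is, given weights $\lambda_i \ge 0$, $i \in I$, $\sum_{i\in I} \lambda_i = 1$ and density matrices $\rho_i \in \mathbb{B}_1^+(\mathbb{H})$,

\[ \sum_{i\in I} \lambda_i\, S(\rho_i) \le S\Big( \sum_{i\in I} \lambda_i\, \rho_i \Big) \le \sum_{i\in I} \lambda_i\, S(\rho_i) + \sum_{i\in I} \eta(\lambda_i) , \tag{5.166} \]

where $\eta$ is the concave function (2.84). If $\mathbb{H} = \mathbb{C}^N$, the von Neumann entropy is continuous on $\mathbb{B}_1^+(\mathbb{H})$ with respect to the trace-norm; namely, if $\rho_{1,2} \in \mathbb{B}_1^+(\mathbb{H})$ are such that $\|\rho_1 - \rho_2\|_1 \le 1/e$, then they satisfy the so-called Fannes inequality

\[ |S(\rho_1) - S(\rho_2)| \le \|\rho_1 - \rho_2\|_1 \log N + \eta(\|\rho_1 - \rho_2\|_1) . \tag{5.167} \]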


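Proof Since $S(\rho) - \log N = -\sum_i r_i\,(\log r_i - \log 1/N)$, boundedness follows from (2.85). The lower bound in (5.166) comes from the concavity of $\eta(x)$; indeed, let $r_j$ and $|r_j\rangle$, respectively $r^i_k$ and $|r^i_k\rangle$, be eigenvalues and eigenvectors of $\rho := \sum_{i\in I} \lambda_i\, \rho_i$, respectively $\rho_i$. Then,

\[ r_j = \sum_{i\in I} \lambda_i\, \langle r_j | \rho_i | r_j \rangle = \sum_{i\in I} \lambda_i \sum_k r^i_k\, |\langle r^i_k | r_j \rangle|^2 , \]

with $\sum_{i\in I,\,k} \lambda_i\, |\langle r^i_k | r_j \rangle|^2 = 1$. Therefore,

\[ S(\rho) = \sum_j \eta(r_j) \ge \sum_{i\in I,\,k,\,j} \lambda_i\, |\langle r^i_k | r_j \rangle|^2\, \eta(r^i_k) = \sum_{i\in I} \lambda_i \sum_k \eta(r^i_k) = \sum_{i\in I} \lambda_i\, S(\rho_i) . \]

On the other hand, from Example 5.2.3.7,

\[ \lambda_i\,\rho_i \le \rho \;\Longrightarrow\; \lambda_i\,\rho_i \log(\lambda_i\,\rho_i) = \lambda_i\,\rho_i \log \rho_i - \eta(\lambda_i)\,\rho_i \le \lambda_i\,\rho_i \log \rho , \]

where the last inequality is understood under the trace; taking the trace and summing over $i \in I$ yields the upper bound in (5.166).

Fannes inequality follows from the fact that $|\eta(u) - \eta(v)| \le \eta(|u - v|)$ if $|u - v| \le 1/e$ and from Example 5.5.6 which implies that

\[ s_k := |r^1_k - r^2_k| \le \sum_{k=1}^{N} s_k =: s \le \|\rho_1 - \rho_2\|_1 \le \frac{1}{e} , \]

where $r^{1,2}_k$ are the ordered eigenvalues of $\rho_{1,2}$. Then,

\[ \begin{aligned} |S(\rho_1) - S(\rho_2)| &= \Big| \sum_{k=1}^{N} \big( \eta(r^1_k) - \eta(r^2_k) \big) \Big| \le \sum_{k=1}^{N} \eta(s_k) = s \sum_{k=1}^{N} \eta\Big(\frac{s_k}{s}\Big) + \eta(s)\\ &\le \|\rho_1 - \rho_2\|_1 \log N + \eta(\|\rho_1 - \rho_2\|_1) , \end{aligned} \]

for $\eta(x)$ increases when $x \in [0, 1/e]$. We complete the proof by showing that $|\eta(u) - \eta(v)| \le \eta(|u - v|)$ indeed holds when $|u - v| \le 1/e$ (see [264]). The function $f(x) := \eta(x + (u - v)) - \eta(x)$ decreases for $u - v \ge 0$, thus $f(0) = \eta(u - v) \ge f(v) = \eta(u) - \eta(v)$. If $u - v \le 1/e$, $\eta(u - v) \ge u - v$, while the increasing function $g(t) := t + \eta(t)$ gives $g(u) = u + \eta(u) \ge v + \eta(v)$ and thus $\eta(u) - \eta(v) \ge v - u$, which implies $\eta(u - v) \ge \eta(v) - \eta(u)$.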
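The inequalities (5.166) and (5.167) can be probed numerically in the simplest setting of commuting density matrices. The sketch below assumes that all states are diagonal in one and the same basis, so that each state is a probability vector, mixtures are convex combinations of vectors and the trace-norm distance reduces to the $\ell_1$ distance of the diagonals; it also assumes $\eta(x) = -x\log x$ with natural logarithms. All numerical values are hypothetical test data.

```python
import math

def eta(x):
    """eta(x) = -x log x, extended by eta(0) = 0 (assumed form of (2.84))."""
    return -x * math.log(x) if x > 0 else 0.0

def entropy(p):
    return sum(eta(x) for x in p)

def mixture(weights, states):
    """Convex combination of diagonal states given as probability vectors."""
    n = len(states[0])
    return [sum(w * s[k] for w, s in zip(weights, states)) for k in range(n)]

def trace_norm_distance(p, q):
    """||rho_1 - rho_2||_1 for states diagonal in the same basis."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Concavity bounds (5.166)
weights = [0.25, 0.75]
states = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
lower = sum(w * entropy(s) for w, s in zip(weights, states))
upper = lower + sum(eta(w) for w in weights)
mixed_entropy = entropy(mixture(weights, states))

# Fannes inequality (5.167) with N = 3
p = [0.5, 0.3, 0.2]
q = [0.45, 0.33, 0.22]
distance = trace_norm_distance(p, q)
fannes_bound = distance * math.log(len(p)) + eta(distance)
entropy_gap = abs(entropy(p) - entropy(q))
```

For non-commuting states the same checks require a diagonalization routine, which is deliberately left out of this sketch.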


Remarks 5.5.5 1. The second inequality in (5.166) becomes an equality if and only if the ranges of the matrices $\rho_i$ are orthogonal to each other; indeed, in such a case the eigenvectors of different $\rho_i$'s are orthogonal so that their spectral decompositions give the spectral decomposition of $\rho$,

\[ \rho = \sum_{i,k} \lambda_i\, r^i_k\, |r^i_k\rangle\langle r^i_k| \;\Longrightarrow\; S(\rho) = -\sum_{i,k} \lambda_i\, r^i_k \log(\lambda_i\, r^i_k) . \]

2. In the case of an infinite dimensional Hilbert space, the von Neumann entropy is only lower semicontinuous: if a sequence of density matrices $\sigma_n$ tends to a density matrix $\sigma$ in trace norm, then $S(\sigma) \le \liminf_n S(\sigma_n)$, in general. As an example [353], take $\sigma_n := (1 - \frac{1}{n})\,\rho + \frac{1}{n}\,\rho_n$, where $S(\rho) = 0$ and $S(\rho_n)$ increases like $n$. Then, $\|\sigma_n - \rho\|_1 \le 2/n \to 0$ when $n \to +\infty$; also, by (5.166),

\[ S(\sigma_n) \ge \Big(1 - \frac{1}{n}\Big) S(\rho) + \frac{1}{n}\, S(\rho_n) = \frac{1}{n}\, S(\rho_n) \ge c > 0 = S(\rho) . \]
The least mixed states, the pure states, are 1-dimensional projectors for which $r_1 = 1$ while all other states have $r_1 < 1$. This suggests the following


Definition 5.5.3 ([353]) A density matrix $\rho_1 \in \mathcal{S}(S)$ is said to be more mixed than another density matrix $\rho_2 \in \mathcal{S}(S)$, $\rho_1 \succ \rho_2$, if their decreasingly ordered eigenvalues $e_j(\rho_{1,2})$, $j = 1, 2, \ldots, d$, satisfy

\[ \sum_{i=1}^{k} e_i(\rho_1) \le \sum_{i=1}^{k} e_i(\rho_2) , \qquad k = 1, 2, \ldots, d . \]

The relation $\succ$ is a total ordering among density matrices of two level systems; this is because $e_1(\rho_{1,2}) + e_2(\rho_{1,2}) = 1$ for any pair of density matrices $\rho_{1,2} \in M_2(\mathbb{C})$. Therefore, $\rho_1 \succ \rho_2 \Leftrightarrow e_1(\rho_1) \le e_1(\rho_2)$.

Unfortunately, for higher dimensional systems $\succ$ is only a partial ordering; for instance, consider the following density matrices $\rho_{1,2} \in M_3(\mathbb{C})$,

\[ \rho_1 = \begin{pmatrix} 1/2 & 0 & 0\\ 0 & 1/2 & 0\\ 0 & 0 & 0 \end{pmatrix} , \qquad \rho_2 = \begin{pmatrix} 2/3 & 0 & 0\\ 0 & 1/6 & 0\\ 0 & 0 & 1/6 \end{pmatrix} . \]

Then, $e_1(\rho_1) = 1/2 < e_1(\rho_2) = 2/3$, but $e_1(\rho_1) + e_2(\rho_1) > e_1(\rho_2) + e_2(\rho_2)$.
The following proposition provides a helpful tool that allows, in some cases, to establish whether two density matrices are in the $\succ$ order.
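The ordering of Definition 5.5.3 only involves partial sums of ordered eigenvalues, so it is easily turned into a test. The following sketch uses exact rational arithmetic to avoid rounding issues; the function name `more_mixed` is a hypothetical choice, and the spectra are those of the two matrices $\rho_{1,2} \in M_3(\mathbb{C})$ considered above, together with the maximally mixed state.

```python
from fractions import Fraction

def more_mixed(spec1, spec2):
    """True if the state with spectrum spec1 is more mixed than the one with spec2.

    Both spectra are sorted decreasingly and their partial sums compared:
    sum_{i<=k} e_i(rho_1) <= sum_{i<=k} e_i(rho_2) for every k.
    """
    a = sorted(spec1, reverse=True)
    b = sorted(spec2, reverse=True)
    sum_a = sum_b = Fraction(0)
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a > sum_b:
            return False
    return True

rho1 = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
rho2 = [Fraction(2, 3), Fraction(1, 6), Fraction(1, 6)]
uniform = [Fraction(1, 3)] * 3

comparable = more_mixed(rho1, rho2) or more_mixed(rho2, rho1)
```

Here `comparable` is `False`, in agreement with the fact that the ordering is only partial in dimension three, while the maximally mixed state is more mixed than both $\rho_1$ and $\rho_2$.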


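Proposition 5.5.5 (Ky Fan Inequality) Given $\rho \in \mathbb{B}_1^+(\mathbb{H})$, it holds that

\[ \sum_{j=1}^{k} e_j(\rho) = \max\Big\{ \mathrm{Tr}(P\rho) \;:\; P^2 = P = P^\dagger ,\; \dim(P\mathbb{H}) = k \Big\} . \tag{5.168} \]


Proof Let $\{|\phi_j\rangle\}_{j=1}^{k}$ be an orthonormal set in $\mathbb{H}$, $\mathbb{K}$ the subspace they generate and $P = \sum_{j=1}^{k} |\phi_j\rangle\langle\phi_j|$ the corresponding orthogonal projector. If $\{|\psi_j\rangle\}_{j=1}^{k}$ is any other ONB in $\mathbb{K}$, then $\sum_{j=1}^{k} \langle \phi_j | \rho | \phi_j \rangle = \sum_{j=1}^{k} \langle \psi_j | \rho | \psi_j \rangle$. Let $|r_j\rangle$ be the eigenvectors of $\rho$ corresponding to the eigenvalues $e_j(\rho) =: r_j$ listed in decreasing order and consider the subspace spanned by $\{|r_j\rangle\}_{j=1}^{k-1}$. The same argument as in the proof of the minmax principle (5.164) ensures the existence of some $|\psi_k\rangle \in \mathbb{K}$ orthogonal to it; analogously, there must exist

\[ |\psi_{k-1}\rangle \in \mathbb{K} \;\perp\; \{|r_j\rangle\}_{j=1}^{k-2} \cup \{|\psi_k\rangle\} \]

and so on. Thus, one collects $\{|\psi_j\rangle\}_{j=1}^{k} \in \mathbb{K}$ such that

\[ |\psi_j\rangle \perp \{ |r_1\rangle, \ldots, |r_{j-1}\rangle, |\psi_{j+1}\rangle, \ldots, |\psi_k\rangle \} \]

and (5.164) yields $\mathrm{Tr}(P\rho) = \sum_{j=1}^{k} \langle \psi_j | \rho | \psi_j \rangle \le \sum_{j=1}^{k} e_j(\rho)$.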
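Examples 5.5.7


1. Given any $\rho \in \mathcal{S}(S)$ and unitary matrices $U_j \in M_N(\mathbb{C})$, $j \in J$, together with weights $0 < \lambda_j < 1$, $\sum_{j\in J} \lambda_j = 1$, set $\tilde\rho := \sum_{j\in J} \lambda_j\, U_j\,\rho\,U_j^\dagger$. If $\rho = \sum_{i=1}^{N} r_i\, |r_i\rangle\langle r_i|$, the unitarily rotated matrices $U_j\,\rho\,U_j^\dagger$ have the same spectrum for $U_j\,\rho\,U_j^\dagger = \sum_{i=1}^{N} r_i\, |\tilde r_i\rangle\langle \tilde r_i|$ and $|\tilde r_i\rangle := U_j|r_i\rangle$ form an ONB. Further, their convex combination $\tilde\rho \succ \rho$; indeed, let $P$ be the projector achieving the maximum in (5.168), $\sum_{i=1}^{k} e_i(\tilde\rho) = \mathrm{Tr}(\tilde\rho\,P)$; then

\[ \sum_{i=1}^{k} e_i(\tilde\rho) = \sum_{j\in J} \lambda_j\, \mathrm{Tr}(U_j^\dagger\,P\,U_j\,\rho) \le \sum_{j\in J} \lambda_j \sum_{i=1}^{k} e_i(\rho) = \sum_{i=1}^{k} e_i(\rho) , \]

for the projectors $U_j^\dagger\,P\,U_j$ need not achieve the maximum in (5.168).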


2. Consider the convex set $\mathbb{B}_1^+(\mathbb{H})$ of all density matrices of a system $S$ described by
a Hilbert space $\mathbb{H}$, not necessarily finite dimensional, and let $\mathcal{S}_{ord}(S)$ be totally
ordered: the most mixed $\rho \in \mathcal{S}_{ord}(S)$ have the largest entropy [353,372]. Indeed,
$\rho_1 \succ \rho_2 \Rightarrow S(\rho_1) \ge S(\rho_2)$. In order to show this, set $\alpha_i := e_i(\rho_1)$; then, for all
$N \ge 1$ and $\alpha_N > 0$,


N-1 k N 
yO ai) (log ax — logag41) + logan = Yo ax log ax . 
k=l i=l] k=1 


Set $\beta_i := e_i(\rho_2)$; since $\rho_1 \succ \rho_2 \Rightarrow \sum_{i=1}^{k} \alpha_i \le \sum_{i=1}^{k} \beta_i$ for all $k \ge 1$, using
(2.85), one finds


N—1 k 


N 
5 —ax logak > — Os Bi) (log ax — log ak+1) + logan 
k=1 


k=1 i=l 


N N N 
— SF A logax = — J Glog He + X Ge — ox) 5 


k=1 k=1 k=1 


for all $N \ge 1$, whence $S(\rho_1) \ge S(\rho_2)$, for $\sum_k \alpha_k = \sum_k \beta_k = 1$.


5.5.6 Composite Systems 


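In quantum information, physical systems $S$ consisting of several subsystems, $S = S_1 + S_2 + \cdots + S_n$, are called multi-partite.

If each of the constituent subsystems is described by a Hilbert space $\mathbb{H}_i$, the Hilbert space of $S$ is $\mathbb{H}^{(n)} = \bigotimes_{i=1}^{n} \mathbb{H}_i$ and its observables are Hermitean elements of the $C^*$ algebra $\mathbb{B}(\mathbb{H}^{(n)}) = \bigotimes_{i=1}^{n} \mathbb{B}(\mathbb{H}_i)$.

Given a multi-partite state $\rho \in \mathcal{S}(S)$, marginal states $\rho_{i_1 i_2 \cdots i_k}$, for all possible choices of subsystems $S_{i_1} + S_{i_2} + \cdots + S_{i_k}$, are obtained by partial tracing over the Hilbert spaces $\mathbb{H}_\ell$ whose indices are different from the selected ones $i_1, i_2, \ldots, i_k$, namely

\[ \rho_{i_1 i_2 \cdots i_k} = \mathrm{Tr}_{j_1} \mathrm{Tr}_{j_2} \cdots \mathrm{Tr}_{j_{n-k}}(\rho) , \qquad j_\ell \neq i_1, i_2, \ldots, i_k , \quad \ell = 1, 2, \ldots, n-k , \]

where $\mathrm{Tr}_j(\rho) = \sum_k \langle \psi_k^{(j)} | \rho | \psi_k^{(j)} \rangle$ denotes the trace computed with respect to any ONB $\{|\psi_k^{(j)}\rangle\} \in \mathbb{H}_j$ and yields a density matrix acting on the Hilbert space $\mathbb{H}^{(n-1)} = \bigotimes_{i\neq j} \mathbb{H}_i$.
In particular, the states of bipartite systems $S = S_1 + S_2$ are described by density matrices $\rho_{12} \in \mathbb{B}_1^+(\mathbb{H}^{(2)})$ with marginal states

\[ \rho_1 = \mathrm{Tr}_2\,\rho_{12} = \sum_j \langle \psi_j^{(2)} | \rho_{12} | \psi_j^{(2)} \rangle , \qquad \rho_2 = \mathrm{Tr}_1\,\rho_{12} = \sum_j \langle \psi_j^{(1)} | \rho_{12} | \psi_j^{(1)} \rangle . \]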


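Proposition 5.5.6 The marginal states of any pure state $\rho_{12} \in \mathcal{S}(S_1 + S_2)$ have the same eigenvalues with the same multiplicity, apart from the zero eigenvalue, and thus the same von Neumann entropy.


Proof Let $|\psi_{12}\rangle \in \mathbb{H}^{(2)}$ be the vector onto which the pure state $\rho_{12}$ projects and $r_j^{(1)}$, $|r_j^{(1)}\rangle$ the non-zero eigenvalues (repeated according to their multiplicities) and eigenvectors of the marginal density matrix $\rho_1 = \mathrm{Tr}_2\,\rho_{12}$. Using the corresponding ONB $\{|r_j^{(1)}\rangle\}$ in $\mathbb{H}_1$ and any other ONB $\{|\phi_k^{(2)}\rangle\}$ in $\mathbb{H}_2$, one can expand

\[ |\psi_{12}\rangle = \sum_{j,k} C_{jk}\, |r_j^{(1)}\rangle \otimes |\phi_k^{(2)}\rangle = \sum_j |r_j^{(1)}\rangle \otimes |\tilde\phi_j^{(2)}\rangle , \]

where $|\tilde\phi_j^{(2)}\rangle := \sum_k C_{jk}\, |\phi_k^{(2)}\rangle$ need not be either orthogonal or normalized. Then,

\[ \rho_1 = \sum_j r_j^{(1)}\, |r_j^{(1)}\rangle\langle r_j^{(1)}| = \sum_{j,k} \langle \tilde\phi_k^{(2)} | \tilde\phi_j^{(2)} \rangle\; |r_j^{(1)}\rangle\langle r_k^{(1)}| , \]

whence $\langle \tilde\phi_k^{(2)} | \tilde\phi_j^{(2)} \rangle = \delta_{jk}\, r_j^{(1)}$. Setting $|r_j^{(2)}\rangle := |\tilde\phi_j^{(2)}\rangle / \sqrt{r_j^{(1)}}$ yields

\[ |\psi_{12}\rangle = \sum_j \sqrt{r_j^{(1)}}\; |r_j^{(1)}\rangle \otimes |r_j^{(2)}\rangle . \tag{5.169} \]

It thus follows that $\rho_2 = \sum_j r_j^{(1)}\, |r_j^{(2)}\rangle\langle r_j^{(2)}|$, whence $S(\rho_1) = S(\rho_2)$.
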

Remark 5.5.6 The expression (5.169) yields the so-called Schmidt decomposition of bipartite pure states $|\psi_{12}\rangle \in \mathbb{H}_1 \otimes \mathbb{H}_2$ into a linear combination with positive coefficients of tensor products of equally indexed states from two ONBs of the two subsystems. The degeneracy of the 0 eigenvalue accounts for the possibly different dimensions of the Hilbert spaces $\mathbb{H}_{1,2}$.
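Proposition 5.5.6 and the Schmidt decomposition (5.169) can be illustrated on a two-qubit pure state. The sketch below is an assumption-laden example, not the text's construction: the state is given through a hypothetical coefficient matrix `C` with $|\psi_{12}\rangle = \sum_{jk} C_{jk}\,|j\rangle\otimes|k\rangle$, the marginals are $\rho_1 = C C^\dagger$ and $\rho_2 = C^T C^*$, and their spectra are obtained from the closed formula for Hermitian $2\times2$ matrices.

```python
import math

def marginals(C):
    """Marginal states of |psi> = sum_jk C[j][k] |j>|k> as nested lists."""
    n1, n2 = len(C), len(C[0])
    rho1 = [[sum(C[j][k] * complex(C[l][k]).conjugate() for k in range(n2))
             for l in range(n1)] for j in range(n1)]
    rho2 = [[sum(C[j][k] * complex(C[j][m]).conjugate() for j in range(n1))
             for m in range(n2)] for k in range(n2)]
    return rho1, rho2

def eigenvalues_2x2(M):
    """Decreasingly ordered eigenvalues of a Hermitian 2x2 matrix."""
    a, d = complex(M[0][0]).real, complex(M[1][1]).real
    b = abs(M[0][1])
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return [mean + radius, mean - radius]

def entropy(spectrum):
    return -sum(r * math.log(r) for r in spectrum if r > 1e-15)

# Hypothetical normalized two-qubit state: 0.36 + 0.2304 + 0.4096 = 1
C = [[0.6, 0.48j], [0.0, 0.64]]
rho1, rho2 = marginals(C)
spec1 = eigenvalues_2x2(rho1)
spec2 = eigenvalues_2x2(rho2)

# Schmidt coefficients are the square roots of the common eigenvalues
schmidt = [math.sqrt(max(r, 0.0)) for r in spec1]
```

The two marginals differ as matrices but share the same spectrum, hence the same entropy; for subsystems of different dimensions the larger marginal would only acquire additional zero eigenvalues.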


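Example 5.5.8 Let a bipartite system $S = S_1 + S_2$ consist of a 2-level system $S_1$ and an $N$-level system $S_2$. Let $\{X_i\}_{i=1}^{N} \subset M_2(\mathbb{C})$ be a set of matrices such that $\sum_{i=1}^{N} X_i^\dagger X_i = \mathbb{1}_2$, consider a fixed vector $\psi \in \mathbb{C}^2$ and the state vector $\mathbb{C}^2 \otimes \mathbb{C}^N \ni |\Psi\rangle := \sum_{i=1}^{N} X_i|\psi\rangle \otimes |i\rangle$, where $\{|i\rangle\}_{i=1}^{N}$ is an ONB for $S_2$. The normalized vector $|\Psi\rangle$ yields marginal states

\[ \begin{aligned} M_2(\mathbb{C}) \ni \rho_1 &= \mathrm{Tr}_2\, |\Psi\rangle\langle\Psi| = \sum_{i=1}^{N} X_i\, |\psi\rangle\langle\psi|\, X_i^\dagger ,\\ M_N(\mathbb{C}) \ni \rho_2 &= \mathrm{Tr}_1\, |\Psi\rangle\langle\Psi| = \sum_{i,j=1}^{N} \langle \psi | X_j^\dagger X_i | \psi \rangle\; |i\rangle\langle j| . \end{aligned} \]

Let $r_a$, $|a\rangle$, $a = 1, 2$, be the eigenvalues, respectively eigenvectors of $\rho_1$; by expanding $X_i|\psi\rangle = \sum_{a=1}^{2} c_{ia}\, |a\rangle$, it turns out that $|\Psi\rangle = \sum_{a=1}^{2} |a\rangle \otimes |\phi_a\rangle$, where $|\phi_a\rangle := \sum_{i=1}^{N} c_{ia}\, |i\rangle$. The vectors $|\hat\phi_a\rangle := |\phi_a\rangle / \|\phi_a\|$ are orthonormal and the 0 eigenvalue of $\rho_2 = r_1\, |\hat\phi_1\rangle\langle\hat\phi_1| + r_2\, |\hat\phi_2\rangle\langle\hat\phi_2|$ is $(N-2)$-degenerate.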


We end this section with a list of properties of the von Neumann entropy which 
pertain to composite systems. 


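Proposition 5.5.7 Let $S = S_1 + S_2$ be a composite system with Hilbert space $\mathbb{H} = \mathbb{H}_1 \otimes \mathbb{H}_2$, $\mathbb{H}_{1,2}$ of dimension $d_{1,2}$. The von Neumann entropy is additive on product states $\mathbb{B}_1^+(\mathbb{H}) \ni \rho_{12} = \rho_1 \otimes \rho_2$,

\[ S(\rho_{12}) = S(\rho_1) + S(\rho_2) . \tag{5.170} \]

Given $\rho_{12} \in \mathbb{B}_1^+(\mathbb{H})$, let $\mathbb{B}_1^+(\mathbb{H}_1) \ni \rho_1 := \mathrm{Tr}_2\,\rho_{12}$ and $\mathbb{B}_1^+(\mathbb{H}_2) \ni \rho_2 := \mathrm{Tr}_1\,\rho_{12}$ be the marginal states. Then

\[ |S(\rho_1) - S(\rho_2)| \le S(\rho_{12}) \le S(\rho_1) + S(\rho_2) . \tag{5.171} \]

The second inequality expresses that the von Neumann entropy is subadditive; more in general, the von Neumann entropy is strongly subadditive. Namely, let $S = S_1 + S_2 + S_3$ be a tripartite system described by a state $\rho_{123} \in \mathbb{B}_1^+(\mathbb{H})$ with $\mathbb{H} = \mathbb{H}_1 \otimes \mathbb{H}_2 \otimes \mathbb{H}_3$. For any cyclic permutation $(i, j, k)$ of $(1, 2, 3)$ let $\mathbb{B}_1^+(\mathbb{H}_i \otimes \mathbb{H}_j) \ni \rho_{ij} := \mathrm{Tr}_k\,\rho_{123}$, $\mathbb{B}_1^+(\mathbb{H}_i) \ni \rho_i := \mathrm{Tr}_{jk}\,\rho_{123}$ be marginal states. Then [228, 372]

\[ S(\rho_{123}) + S(\rho_j) \le S(\rho_{ij}) + S(\rho_{jk}) . \tag{5.172} \]

Furthermore, the differences between the von Neumann entropies of $\rho_{12}$ and of the marginal states $\rho_{1,2}$, $S(\rho_{12}) - S(\rho_{1,2})$, are concave:

\[ S\Big( \sum_j \lambda_j\, \rho_{12}^{(j)} \Big) - S\Big( \sum_j \lambda_j\, \rho_{1,2}^{(j)} \Big) \ge \sum_j \lambda_j \Big( S(\rho_{12}^{(j)}) - S(\rho_{1,2}^{(j)}) \Big) , \tag{5.173} \]

where $\lambda_j \ge 0$ and $\sum_j \lambda_j = 1$.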
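Proof Additivity comes from the fact that the spectrum of $\rho_{12} = \rho_1 \otimes \rho_2$ consists of the products of the eigenvalues of $\rho_1$ and $\rho_2$.

Assume strong subadditivity holds and let $\rho_{AB} = \sum_j r_j^{AB}\, |r_j^{AB}\rangle\langle r_j^{AB}|$ be a density matrix on $\mathbb{H}_A \otimes \mathbb{H}_B$ and

\[ |\sqrt{\rho_{AB}}\rangle = \sum_j \sqrt{r_j^{AB}}\; |r_j^{AB}\rangle \otimes |r_j^{AB}\rangle \in (\mathbb{H}_A \otimes \mathbb{H}_B) \otimes (\mathbb{H}_A \otimes \mathbb{H}_B) \]

the corresponding GNS state. Set $\mathbb{H}_1 := \mathbb{H}_A$, $\mathbb{H}_2 := \mathbb{H}_B$ in the first factor, $\mathbb{H}_3 := \mathbb{H}_A \otimes \mathbb{H}_B$ in the second one and

\[ \rho_{123} = |\sqrt{\rho_{AB}}\rangle\langle\sqrt{\rho_{AB}}| = \sum_{j,k} \sqrt{r_j^{AB}\, r_k^{AB}}\; |r_j^{AB}\rangle\langle r_k^{AB}| \otimes |r_j^{AB}\rangle\langle r_k^{AB}| . \]

Then, $\rho_3 = \mathrm{Tr}_{12}\,\rho_{123} = \rho_{AB} = \mathrm{Tr}_3\,\rho_{123} = \rho_{12}$; therefore, $\rho_1 = \mathrm{Tr}_{23}\,\rho_{123} = \rho_A$, $\rho_2 = \mathrm{Tr}_{13}\,\rho_{123} = \rho_B$. Also, because of purity, $S(\rho_{123}) = 0$, thus (5.172) and Proposition 5.5.6 yield

\[ S(\rho_{AB}) = S(\rho_3) \le S(\rho_{13}) + S(\rho_{23}) = S(\rho_2) + S(\rho_1) = S(\rho_A) + S(\rho_B) , \]

which implies subadditivity. The lower bound in (5.171) follows instead from

\[ \begin{aligned} S(\rho_A) &= S(\rho_1) \le S(\rho_{13}) + S(\rho_{12}) = S(\rho_2) + S(\rho_3) = S(\rho_B) + S(\rho_{AB}) ,\\ S(\rho_B) &= S(\rho_2) \le S(\rho_{12}) + S(\rho_{23}) = S(\rho_3) + S(\rho_1) = S(\rho_{AB}) + S(\rho_A) . \end{aligned} \]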


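In order to prove strong subadditivity, we introduce the quantum relative entropy of two density matrices $\rho$ and $\sigma$ (see Definition 6.3.1)

\[ S(\rho\,,\,\sigma) := \mathrm{Tr}(\rho \log \rho - \rho \log \sigma) , \]

which is well defined when $\sigma|\psi\rangle = 0 \Rightarrow \rho|\psi\rangle = 0$.
We shall prove in Sect. 6.3 that $S(\rho\,,\,\sigma)$ does not increase under maps like the partial traces, thence

\[ S\Big( \rho_{12}\,,\, \frac{\mathbb{1}_1}{d_1} \otimes \rho_2 \Big) = S\Big( \mathrm{Tr}_3\,\rho_{123}\,,\, \mathrm{Tr}_3\Big( \frac{\mathbb{1}_1}{d_1} \otimes \rho_{23} \Big) \Big) \le S\Big( \rho_{123}\,,\, \frac{\mathbb{1}_1}{d_1} \otimes \rho_{23} \Big) . \]

On the other hand,

\[ \begin{aligned} S\Big( \rho_{12}\,,\, \frac{\mathbb{1}_1}{d_1} \otimes \rho_2 \Big) &= -S(\rho_{12}) + S(\rho_2) + \log d_1 \quad\text{and}\\ S\Big( \rho_{123}\,,\, \frac{\mathbb{1}_1}{d_1} \otimes \rho_{23} \Big) &= -S(\rho_{123}) + S(\rho_{23}) + \log d_1 , \end{aligned} \]

whence $\big( S(\rho_{123}) - S(\rho_{23}) \big) - \big( S(\rho_{12}) - S(\rho_2) \big) \le 0$.


The concavity (5.173) follows from another property of the relative entropy which shall be discussed in Sect. 6.3, namely its joint convexity:

\[ S\Big( \sum_j \lambda_j\, \rho^{(j)}\,,\, \sum_j \lambda_j\, \sigma^{(j)} \Big) \le \sum_j \lambda_j\, S(\rho^{(j)}\,,\,\sigma^{(j)}) , \]

where $\lambda_j \ge 0$ and $\sum_j \lambda_j = 1$. Let $\rho^{(j)} := \rho_{12}^{(j)}$ and $\sigma^{(j)} := \rho_1^{(j)} \otimes \frac{\mathbb{1}_2}{d_2}$ with $\rho_1^{(j)} = \mathrm{Tr}_2\,\rho_{12}^{(j)}$; then,

\[ \begin{aligned} S\Big( \sum_j \lambda_j\, \rho_{12}^{(j)}\,,\, \sum_j \lambda_j\, \rho_1^{(j)} \otimes \frac{\mathbb{1}_2}{d_2} \Big) &= -S\Big( \sum_j \lambda_j\, \rho_{12}^{(j)} \Big) + S\Big( \sum_j \lambda_j\, \rho_1^{(j)} \Big) + \log d_2\\ &\le \sum_j \lambda_j\, S\Big( \rho_{12}^{(j)}\,,\, \rho_1^{(j)} \otimes \frac{\mathbb{1}_2}{d_2} \Big) = \sum_j \lambda_j \Big( S(\rho_1^{(j)}) - S(\rho_{12}^{(j)}) \Big) + \log d_2 . \end{aligned} \]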

j 


5.5.7 Entangled States 


One of the most puzzling and fascinating aspects of quantum mechanics is its non-locality embodied by the concept of quantum entanglement [183]. This is a property of certain quantum states of composite systems, called entangled, which are such that their constituting subsystems cannot be attributed properties of their own, not even with a certain probability. As a paradigm of such states, consider the following state vector of two spin 1/2 particles (the simplest instance of the state vectors (5.18) in Example 5.2.3.9),

\[ |\Psi_{00}\rangle := \frac{|00\rangle + |11\rangle}{\sqrt2} , \tag{5.174} \]

where $|0\rangle$ and $|1\rangle$ are eigenstates of the Pauli matrix $\sigma_3$. By looking at the corresponding projector,

\[ \begin{aligned} |\Psi_{00}\rangle\langle\Psi_{00}| &= \frac{1}{2}\Big( |0\rangle\langle 0| \otimes |0\rangle\langle 0| + |1\rangle\langle 1| \otimes |1\rangle\langle 1| \Big)\\ &+ \frac{1}{2}\Big( |0\rangle\langle 1| \otimes |0\rangle\langle 1| + |1\rangle\langle 0| \otimes |1\rangle\langle 0| \Big) , \end{aligned} \]

one sees that, while the first line is the density matrix of an equally distributed mixture of both spins pointing up and down along the $z$ direction, the interference term in the second line forbids to attribute these two properties with probability 1/2 to the component spins. By rotating the orthonormal projectors $(\mathbb{1} \pm \sigma_3)/2$ into any two other orthonormal pairs $(\mathbb{1} \pm n_1\cdot\sigma)/2$ and $(\mathbb{1} \pm n_2\cdot\sigma)/2$, the same obstruction occurs along any two directions $n_{1,2}$.

[Fig. 5.4 Bell states: quantum circuit with input qubits $|x\rangle$, $|y\rangle$, the gate $U_{CNOT}$ and output $|\Psi_{xy}\rangle$]


Example 5.5.9 (Bell States) [266] The symmetric vector (5.174) is the first one in the so-called Bell basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ of which the others read

\[ |\Psi_{01}\rangle := \frac{|01\rangle + |10\rangle}{\sqrt2} , \qquad |\Psi_{10}\rangle := \frac{|00\rangle - |11\rangle}{\sqrt2} , \qquad |\Psi_{11}\rangle := \frac{|01\rangle - |10\rangle}{\sqrt2} . \]

In quantum information 2-level systems are called qubits, unitary actions on them quantum gates and nets of unitary gates quantum circuits. The Bell states can be created out of a separable pure state of two qubits by means of local and non-local operations, according to the quantum circuit in Fig. 5.4.

The input vectors $|x\rangle$ and $|y\rangle$, $x, y = 0, 1$, are members of the so-called computational basis; $|x\rangle$, called control qubit, is subjected to a Hadamard unitary rotation (see (5.60)), $U_H$, and then, together with $|y\rangle$, called target qubit, to a so-called Control-Not unitary gate, $U_{CNOT}$. The first transformation affects one of the two qubits only via the matrix $M_4(\mathbb{C}) \ni U_H \otimes \mathbb{1}$ and is thus local; the second one involves both qubits in a non-local way. Indeed, the unitary matrix $U_{CNOT} \in M_4(\mathbb{C})$ implements the classical CNOT gate, $CNOT(x, y) = (x, y \oplus x)$, that acting on pairs $(x, y)$ of bits leaves the control bit unchanged and adds it to the target bit:

\[ \begin{aligned} CNOT(00) &= (00) & CNOT(01) &= (01)\\ CNOT(10) &= (11) & CNOT(11) &= (10) \end{aligned} \]

[Fig. 5.5 CNOT gate: $U_{CNOT}$ maps the inputs $|x\rangle$, $|y\rangle$ into the outputs $|x\rangle$, $|y \oplus x\rangle$]

If we substitute pairs of bits with tensor products of computational basis vectors, the Pauli matrix $\sigma_1$ flips the $|x\rangle$ so that the same relations are implemented on $\mathbb{C}^2 \otimes \mathbb{C}^2$ by

\[ U_{CNOT} = |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes \sigma_1 = \begin{pmatrix} \mathbb{1} & 0\\ 0 & \sigma_1 \end{pmatrix} . \tag{5.175} \]

Let $|\Psi_{xy}\rangle := U_{CNOT}\,(U_H \otimes \mathbb{1})\,|xy\rangle$; by a Hadamard rotation (5.60) one gets (Fig. 5.5)

\[ |\Psi_{xy}\rangle = \frac{1}{\sqrt2} \sum_{i=0}^{1} (-1)^{ix}\, |i, y \oplus i\rangle = \frac{|0, y\rangle + (-1)^x\, |1, y \oplus 1\rangle}{\sqrt2} , \]

and, by varying $x, y \in \{0, 1\}$, one obtains the Bell basis.
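The circuit of Example 5.5.9 can be simulated with elementary matrix algebra. In the sketch below, which is only an illustration, two-qubit vectors are stored as lists of four amplitudes ordered as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (an assumed convention, with the control qubit as first factor); the helper names `kron`, `apply`, `basis` and `bell` are hypothetical.

```python
import math

s = 1.0 / math.sqrt(2.0)
HADAMARD = [[s, s], [s, -s]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
# U_CNOT = |0><0| (x) 1 + |1><1| (x) sigma_1 in the basis |00>, |01>, |10>, |11>
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def kron(A, B):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def basis(x, y):
    """Computational basis vector |x y>."""
    v = [0.0] * 4
    v[2 * x + y] = 1.0
    return v

def bell(x, y):
    """|Psi_xy> = U_CNOT (U_H (x) 1) |x y>."""
    return apply(CNOT, apply(kron(HADAMARD, IDENTITY), basis(x, y)))

bell_basis = {(x, y): bell(x, y) for x in (0, 1) for y in (0, 1)}

# Gram matrix of the four vectors: the identity for an orthonormal basis
labels = sorted(bell_basis)
gram = [[sum(a * b for a, b in zip(bell_basis[p], bell_basis[q])) for q in labels]
        for p in labels]
```

For instance, `bell_basis[(0, 0)]` has amplitudes $1/\sqrt2$ on $|00\rangle$ and $|11\rangle$, reproducing (5.174), while `bell_basis[(1, 1)]` is the antisymmetric vector $(|01\rangle - |10\rangle)/\sqrt2$.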


Entanglement is a purely quantum phenomenon, with no classical counterpart; 
it has from the start attracted a lot of scientific and, unfortunately, also pseudo- 
scientific interest; one of the great merits of quantum information is to have promoted 
entanglement to the status of a physical resource for performing informational and 
computational tasks otherwise impossible in a purely commutative context. 

In the following, we shall mainly focus upon bipartite discrete quantum systems consisting of two parties described by means of finite dimensional Hilbert spaces $\mathbb{C}^{d_1}$ and $\mathbb{C}^{d_2}$, respectively. Within the state-space $\mathcal{S}(S_1 + S_2)$, one distinguishes separable from entangled states.

Definition 5.5.4 (Separable and Entangled States) A density matrix $\rho \in \mathcal{S}(S_1 + S_2)$ is separable if and only if it can be approximated in trace norm by a linear convex combination of tensor products of density matrices:

$$\rho = \sum_{(i_1, i_2) \in I_1 \times I_2} \lambda_{i_1 i_2}\, \rho^1_{i_1} \otimes \rho^2_{i_2}, \qquad \lambda_{i_1 i_2} \geq 0, \qquad \sum_{(i_1, i_2) \in I_1 \times I_2} \lambda_{i_1 i_2} = 1. \qquad (5.176)$$

Those $\rho \in \mathcal{S}(S_1 + S_2)$ which cannot be written in a factorized form as in (5.176) are called entangled or non-separable states.


Remark 5.5.7 Pure separable states are of the form $|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle$ for some $\psi_{1,2} \in \mathbb{C}^{d_{1,2}}$, otherwise they are entangled. The set $\mathcal{S}_{sep}(S)$ of separable states of the bipartite system $S$ is the closure of the convex hull of its separable pure states (see Remark 5.3.2.5).


In order to judge whether a pure bipartite state is entangled or separable, it is sufficient to look at its marginal density matrices.


Proposition 5.5.8 A state vector $|\Psi_{12}\rangle \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ of a bipartite system $S_1 + S_2$ is separable if and only if its marginal states $\rho_{1,2}$ are pure.


Proof If $|\Psi_{12}\rangle$ is separable, its projector is the tensor product of two projectors and partial tracing yields one of them. Vice versa, if the marginal density matrices are not projectors, then the Schmidt decomposition (5.169) contains more than one pair and $|\Psi_{12}\rangle$ is entangled.


The structure of separable states is apparent from (5.176): they can be obtained by mixing with weights $\lambda_{i_1 i_2}$ otherwise independent states of $S_1$ and $S_2$, the only possible correlations between them being those relative to the probability distribution $\{\lambda_{i_1 i_2}\}$ associated with the weights and thus of purely classical nature. Instead, pure entangled states carry correlations that are purely quantum mechanical.


Examples 5.5.10 
1. Consider a bipartite system consisting of two $d$-level systems in the state (5.18) which generalizes the two qubit symmetric state (5.174). From partial tracing the projector $P^d_+ = |\Psi^d_+\rangle\langle\Psi^d_+|$, one gets

$$\rho_{1,2} = \mathrm{Tr}_{2,1}\big(P^d_+\big) = \frac{\mathbb{1}}{d},$$

namely the totally mixed state for both parties. Thus, the von Neumann entropy of the bipartite state $P^d_+$ is smaller than that of either of its components, $0 = S(P^d_+) < S(\rho_{1,2}) = \log d$, which is maximal, instead. In other terms, the information content of the entangled pure state $P^d_+$ is smaller than that of its constituent parties. This holds for all entangled pure states. In order to see this, one uses the Schmidt-decomposition (5.169) and Proposition 5.5.6: if $\rho_{12} = |\Psi_{12}\rangle\langle\Psi_{12}|$, then $S(\rho_{12}) = 0$, and $S(\rho_1) = S(\rho_2) = -\sum_i r_i \log r_i > 0$, with $r_i$ the common non-zero eigenvalues of $\rho_{1,2}$, unless $\rho_{1,2}$ are pure states and thus $\rho_{12}$ separable.

2. The previous observation makes the statistical properties of pure entangled states incompatible with classical ones; indeed, by (2.91) one knows that the Shannon entropy of a bipartite classical system (described by two random variables) cannot be less than that of any of its marginal distributions. This classical behavior is characteristic of all separable states; namely, the von Neumann entropy of all separable bipartite states cannot be smaller than the von Neumann entropy of their marginal states. This fact follows from (5.173); in fact, consider a separable state

$$\rho_{12} = \sum_{i,j} \lambda_{ij}\, \rho^{(1)}_i \otimes \rho^{(2)}_j \in \mathbb{B}_1(\mathbb{H}_1 \otimes \mathbb{H}_2),$$

so that $\rho_1 = \sum_i \lambda^{(1)}_i \rho^{(1)}_i$, $\lambda^{(1)}_i := \sum_j \lambda_{ij}$; then, by applying (5.173), (5.170) and the positivity of the von Neumann entropy (see (5.165)), one gets

$$S(\rho_{12}) - S(\rho_1) \geq \sum_{i,j} \lambda_{ij} \Big( S\big(\rho^{(1)}_i \otimes \rho^{(2)}_j\big) - S\big(\rho^{(1)}_i\big) \Big) = \sum_{i,j} \lambda_{ij}\, S\big(\rho^{(2)}_j\big) \geq 0.$$


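3. The so-called GHZ states are entangled pure states of tripartite systems consisting of 3 qubits: $|\Psi_+\rangle := \frac{|000\rangle + |111\rangle}{\sqrt{2}}$ in the computational basis. Though entangled as tripartite states, all their two qubit marginal states are separable, for instance

$$\rho^{(12)} = \frac{1}{2} \sum_{i,j=0}^{1} |i\rangle\langle j| \otimes |i\rangle\langle j|\; \langle j | i \rangle = \frac{1}{2} \sum_{i=0}^{1} |i\rangle\langle i| \otimes |i\rangle\langle i|.$$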


From | ©, ) one obtains an ONB in H®) = (C?)8? by acting locally with the 
Pauli matrices, 


| Yab) = 0% @ 0? @o$| P+), a,b,c=0,1 
1 


1 l l 
(Vaes | Wabe) = 5 D (logt LA) (iloft lj) (ilog lj) 
i,j=0 as: 
ij 


1 
1 ; 
= 5 OD (iloft li) (iloft li) = Sadðveðes . 
i=0 
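Items 1 and 3 of Examples 5.5.10 lend themselves to a direct numerical check. The sketch below assumes NumPy; `partial_trace` and `von_neumann_entropy` are illustrative helper names, not library functions, and natural logarithms are used.

```python
import numpy as np

def partial_trace(rho, dims, keep):
    """Reduced density matrix on the subsystems listed in `keep` (ascending order)."""
    dims = list(dims)
    t = rho.reshape(dims + dims)
    # Trace out the unwanted subsystems, highest index first, so that the
    # positions of the remaining row/column axes are not shifted.
    for k in sorted(set(range(len(dims))) - set(keep), reverse=True):
        m = t.ndim // 2
        t = np.trace(t, axis1=k, axis2=k + m)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log rho), computed from the eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

# Item 1: maximally entangled state of two d-level systems, sum_i |ii> / sqrt(d).
d = 3
psi = np.eye(d).reshape(-1) / np.sqrt(d)
P = np.outer(psi, psi)
rho_1 = partial_trace(P, [d, d], [0])

# Item 3: GHZ state (|000> + |111>) / sqrt(2) and its two-qubit marginal.
ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)
rho_12 = partial_trace(np.outer(ghz, ghz), [2, 2, 2], [0, 1])
```

The marginal `rho_1` is the totally mixed state with entropy $\log d$, while the pure bipartite state has zero entropy; the GHZ marginal `rho_12` is the diagonal mixture $\frac{1}{2}(|00\rangle\langle 00| + |11\rangle\langle 11|)$.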
5.6 Dynamics and State-Transformations


The standard time-evolution of quantum systems is typically described by a strongly continuous one-parameter family of unitary operators $\{U_t\}_{t \in \mathbb{R}}$ on a Hilbert space $\mathbb{H}$, fulfilling the group composition law $U_t U_s = U_{t+s}$, for all $t, s \in \mathbb{R}$. By Stone's theorem [353], the group is generated by a self-adjoint operator $H$ on $\mathbb{H}$, the Hamiltonian, such that, for all $\psi \in \mathbb{H}$, ($\hbar = 1$)

$$\partial_t |\psi_t\rangle = -i\, H |\psi_t\rangle, \qquad |\psi_t\rangle = U_t |\psi\rangle, \qquad U_t = e^{-i t H}. \qquad (5.177)$$


This type of time-evolution equation is proper to the so-called closed quantum systems. As any other physical system, quantum systems $S$ are also in contact with the environment $E$ which contains them; when their mutual interactions are negligible, the dynamics of $S$ is independent of $E$ and is reversible. When the interactions between $S$ and $E$ cannot be neglected, it may still be possible to derive a closed dynamics for the system $S$ alone which accounts for the presence of the environment. In such cases, $S$ is known as an open quantum system and its so-called reduced dynamics is irreversible and incorporates noisy and dissipative effects due to the presence of $E$.

The Schrödinger time-evolution easily extends from state vectors to mixtures. Since pure states $|\psi\rangle\langle\psi|$ evolve into pure states $U_t |\psi\rangle\langle\psi| U_t^\dagger$, extension to convex combinations of pure states yields the Liouville equation

$$\partial_t \rho_t = -i\, [H, \rho_t], \qquad (5.178)$$

with formal solution $\rho_t = U_t\, \rho\, U_t^\dagger$ for any initial state $\rho \in \mathbb{B}_1(\mathbb{H})$. By duality (compare (2.9) and (2.7)), if $X \in \mathbb{B}(\mathbb{H})$ then $\mathrm{Tr}(\rho_t X) = \mathrm{Tr}(\rho X_t)$ and one gets the Heisenberg time-evolution equation for the operators

$$\partial_t X_t = i\, [H, X_t]. \qquad (5.179)$$


This gives rise to a one parameter family $\{\mathbb{U}_t\}_{t \in \mathbb{R}}$ of automorphisms of $\mathbb{B}(\mathbb{H})$,

$$X \mapsto \mathbb{U}_t[X] := X_t = U_t^\dagger\, X\, U_t, \qquad (5.180)$$

that preserve hermiticity and products of operators,

$$\mathbb{U}_t[X^\dagger] = \mathbb{U}_t[X]^\dagger, \qquad \mathbb{U}_t[XY] = \mathbb{U}_t[X]\, \mathbb{U}_t[Y] \qquad \forall\, X, Y \in \mathbb{B}(\mathbb{H}).$$

As automorphisms, these linear maps are positive, and also completely positive as their action is of the Kraus-type discussed in Proposition 5.2.4. By duality the action of $\mathbb{U}_t$ is transferred to the action of the dual $\mathbb{U}_t^*$ on $\mathbb{B}_1(\mathbb{H})$: $\mathbb{U}_t^*[\rho] = U_t\, \rho\, U_t^\dagger$. The dual map $\mathbb{U}_t^*$ preserves the trace, $\mathrm{Tr}(\mathbb{U}_t^*[\rho]) = 1$, and sends projectors into projectors:

$$P^\dagger = P = P^2 \implies \big(\mathbb{U}_t^*[P]\big)^2 = \mathbb{U}_t^*[P^2] = \mathbb{U}_t^*[P].$$

A state $\rho$ is an equilibrium state if and only if it commutes with the Hamiltonian that generates the dynamics, $\mathbb{U}_t^*[\rho] = \rho \iff [H, \rho] = 0$. However, if a state $\rho$ changes in the course of time under a time-evolution implemented by unitary operators, its spectrum does not. As a consequence, as much as the Gibbs entropy of classical probability distributions evolving under a Hamiltonian flow on phase-space is a constant of the motion, the von Neumann entropy is always preserved by the Schrödinger-Liouville time-evolution, $S(\mathbb{U}_t^*[\rho]) = S(\rho)$.
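The invariance of the spectrum, and thus of the von Neumann entropy, under the Schrödinger-Liouville evolution can be illustrated on a finite-level system. The following is a minimal sketch, assuming NumPy; the Hamiltonian and the initial state are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
d = 4

# Random self-adjoint Hamiltonian and random density matrix (illustrative only).
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (A + A.conj().T) / 2
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = B @ B.conj().T
rho = rho / np.trace(rho).real

energies, V = np.linalg.eigh(H)

def U(t):
    """U_t = exp(-i t H), built from the spectral decomposition of H."""
    return (V * np.exp(-1j * t * energies)) @ V.conj().T

def evolve_state(rho, t):
    """Schroedinger picture: rho -> U_t rho U_t^dagger."""
    return U(t) @ rho @ U(t).conj().T

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

rho_t = evolve_state(rho, 0.7)

# A state that is a function of H commutes with H and is thus an equilibrium state.
p = np.exp(-energies)
rho_eq = (V * (p / p.sum())) @ V.conj().T
```

Here `rho_t` differs from `rho` as a matrix, but both have the same eigenvalues, whereas `rho_eq` is left unchanged by the evolution.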


Examples 5.6.1 


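1. We have seen that density matrices $\rho$ of 2-level systems are identified by their Bloch vectors $\boldsymbol{\rho} \in \mathbb{R}^3$. By denoting them as kets $|\rho\rangle$, the linear action of the commutator on the right hand side of (5.178) corresponds to a $3 \times 3$ matrix acting on $|\rho\rangle$, whence the Liouville equation can be recast in the form $\partial_t |\rho_t\rangle = -2\mathcal{H} |\rho_t\rangle$. Since $[\mathbb{1}, \rho] = 0$, it is no restriction to take the Hamiltonian of the form $H = \boldsymbol{\omega} \cdot \boldsymbol{\sigma}$ with $\boldsymbol{\omega} = (\omega_1, \omega_2, \omega_3) \in \mathbb{R}^3$, $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$. Then, the algebraic relations (5.58) yield

$$\frac{d\rho_i}{dt} = 2 \sum_{j,k=1}^{3} \varepsilon_{ijk}\, \omega_j\, \rho_k \implies \mathcal{H} = \begin{pmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{pmatrix}.$$

Thus, Bloch vectors rotate with angular velocity $2\omega$, $\omega := \|\boldsymbol{\omega}\|$, around the direction of $\boldsymbol{\omega}$; their lengths are then constant, pure states remain pure and the surface of the Bloch sphere is mapped into itself.

Suppose $\boldsymbol{\omega} = (0, 0, \omega)$, then, by series expansion,

$$U_t = e^{-i t \omega \sigma_3} = \cos \omega t - i\, \sigma_3 \sin \omega t;$$

thus, $\sigma_3(t) := U_t^\dagger\, \sigma_3\, U_t = \sigma_3$, while

$$\sigma_+(t) := U_t^\dagger\, \sigma_+\, U_t = \sigma_+ (\cos 2\omega t + i \sin 2\omega t) = \sigma_+\, e^{2 i \omega t}.$$

The same result more directly follows from the fact that

$$\sigma_3\, \sigma_+ = \sigma_+ \sigma_3 + 2\sigma_+ = \sigma_+ (\sigma_3 + 2)$$

and that this relation extends to functions $f(\sigma_3)$ that can be expanded as power series, namely $f(\sigma_3)\, \sigma_+ = \sigma_+\, f(\sigma_3 + 2)$.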
2. Consider an array of N spins 1/2 equipped with the Hamiltonian [353] 


N- 


N 
m= Y Buo + > 


1 N 
eli) ol oit = (H; + HY"), 
j=l i=1 


j=l 
where, as in Example 5.4.2, $\sigma_3^j$ denotes the Pauli matrix $\sigma_3$ at site $j$ and products of Pauli matrices at different sites denote tensor products of commuting operators. The single sum corresponds to the spins being coupled to a vertical constant magnetic field $B$, while the double sum describes spin-spin interactions whose range is regulated by the coupling constants $\epsilon(i)$ which only depend on the distance between spins.

Let the array be provided with periodic boundary conditions $\sigma_\#^{j+N} = \sigma_\#^{j}$ ($\sigma_\# = \sigma_3, \sigma_\pm$); then, the $j$-th spin interacts with the same strength with those symmetrically placed on its left and right hand side. Suppose $N$ odd, it follows that $H_j^{int}$ can be recast as

$$H_j^{int} = \sum_{i=1}^{(N-1)/2} \epsilon(i) \left( \sigma_3^j \sigma_3^{j+i} + \sigma_3^j \sigma_3^{j-i} \right).$$


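Using the previous example and the fact that $\sigma_3^k$ commutes with all spin operators from sites different from $k$, one obtains

$$\sigma_3^j(t) = e^{i t H_N}\, \sigma_3^j\, e^{-i t H_N} = \sigma_3^j \qquad (5.181)$$

$$\sigma_+^j(t) = e^{i t H_N}\, \sigma_+^j\, e^{-i t H_N} = e^{i t (H_j + H_j^{int})}\, \sigma_+^j\, e^{-i t (H_j + H_j^{int})} = \sigma_+^j\, e^{2 i t B \mu} \prod_{i=1}^{(N-1)/2} e^{2 i t \epsilon(i) (\sigma_3^{j+i} + \sigma_3^{j-i})}$$
$$= \sigma_+^j\, e^{2 i t B \mu} \prod_{i=1}^{(N-1)/2} \Big( \big(\cos 2t\epsilon(i) + i\, \sigma_3^{j+i} \sin 2t\epsilon(i)\big) \times \big(\cos 2t\epsilon(i) + i\, \sigma_3^{j-i} \sin 2t\epsilon(i)\big) \Big) \qquad (5.182)$$

while $\sigma_-^j(t)$ is obtained by taking the adjoint of $\sigma_+^j(t)$. Let the spin system be endowed with a state which is the tensor product of equal pure states as in (5.110), one for each of the sites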

N 
p2% = Re mee l+s l V1 —s2e? 
2 0 2\yl-=se? 1-s 


This state is not invariant under the time-automorphism in (5.181) and (5.182): 
indeed, p®% (c/(t)) = s for all j, but 


=D 7 
pno) = "84 poly T] ( (cos 2re(i) + poi’) sin 2¢(i)) x 
i=l 
x (cos 2te(i) + plait’) sin 2e(i))) 


v1 


2itB -s 2 
=e H — — ING), 
$$f_N(s, t) := \prod_{i=1}^{(N-1)/2} \big( \cos 2\epsilon(i) t + i\, s \sin 2\epsilon(i) t \big).$$

If the coupling constants decrease exponentially with the spin distance, $\epsilon(i) = 2^{-(i+1)}$, and $s = 0$, then

$$f_N(0, t) = \prod_{i=1}^{(N-1)/2} \cos \frac{t}{2^i}$$

and the observables $\sigma_+^j$ show a recurrence time that increases as $2^{(N-1)/2}$,


PEN (ol (t)) = tBu; 5 FR, t). (5.183) 


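3. Consider $f$ uncoupled harmonic oscillators of masses $m = 1$ and frequencies $\omega_i$ described by the algebra of Weyl operators (5.77) $W(r)$, $r = (q, p) \in \mathbb{R}^{2f}$, and by the Hamiltonian operator

$$H = \frac{1}{2} \sum_{i=1}^{f} \big( \hat{p}_i^2 + \omega_i^2\, \hat{q}_i^2 \big).$$

Using (5.70), the Heisenberg equations of motion (5.179) for position and momentum operators $\hat{r} = (\hat{q}, \hat{p})$ read

$$\frac{d\hat{q}}{dt} = \hat{p}, \qquad \frac{d\hat{p}}{dt} = -\Omega^2\, \hat{q},$$

where $\Omega$ is the diagonal $f \times f$ matrix $\Omega = \mathrm{diag}(\omega_1, \omega_2, \ldots, \omega_f)$. They are solved by

$$\hat{r}_t := \mathbb{U}_t[\hat{r}] = e^{i t H}\, \hat{r}\, e^{-i t H} = A_t\, \hat{r}, \qquad A_t = \begin{pmatrix} \cos \Omega t & \Omega^{-1} \sin \Omega t \\ -\Omega \sin \Omega t & \cos \Omega t \end{pmatrix}.$$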


Because of linearity, it turns out that the time-evolution maps Weyl operators into 
Weyl operators, 


W(r) = eir (Zr) o UWr] =: W, (r) = eir (Zr) — eilr) (£1) 
= W(r;), (5.184) 


where r; solves the Hamilton equations for f classical harmonic oscillators. 
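Using the notation (5.97), one passes to annihilation and creation operators via the relations

$$\begin{pmatrix} a \\ a^\dagger \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \Omega^{1/2} & i\, \Omega^{-1/2} \\ \Omega^{1/2} & -i\, \Omega^{-1/2} \end{pmatrix} \begin{pmatrix} \hat{q} \\ \hat{p} \end{pmatrix}.$$

Up to an additive constant, the Hamiltonian then reads $H = \sum_{i=1}^{f} \omega_i\, a_i^\dagger a_i$, so that

$$a_i(t) = \mathbb{U}_t[a_i] = a_i\, e^{-i \omega_i t}, \qquad a_i^\dagger(t) = \mathbb{U}_t[a_i^\dagger] = a_i^\dagger\, e^{i \omega_i t}.$$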

4. Example 5.4.3 provides the right algebraic setting for quantizing classical dynamical systems as those studied in Example 2.1.3. As much as in that case, the quantum time-evolution will be described by a one-parameter group $\{\Theta_N^t\}_{t \in \mathbb{Z}}$ consisting of integer powers of an automorphism $\Theta_N : M_N(\mathbb{C}) \to M_N(\mathbb{C})$.

Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of Example 2.1.3 be an evolution matrix with integer components and determinant equal to one. According to (2.22), the time-evolution of the exponential functions reads

$$(U_A e_n)(r) = e^{2\pi i\, n \cdot (A r)} = e^{2\pi i\, (A^T n) \cdot r} = e_{A^T n}(r), \qquad A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}.$$

If one identifies $e_m$ with $W_0(m)$, then the discrete Weyl relations (5.87) can be read as a non-commutative deformation of the fact that the exponential functions commute:

$$W_0(n)\, W_0(m) = W_0(n + m) = W_0(m)\, W_0(n). \qquad (5.185)$$

It is thus natural to define the automorphism as

$$\Theta_N[W_N(n)] = W_N(A^T n), \qquad (5.186)$$

and extend it linearly to the whole of $M_N(\mathbb{C})$.
In order to be an automorphism, © has to fulfil ©y [1px]; then, from (5.86) and 
(5.88), 


Oy[UN] = ei% = Wy (NA (1, 0)) = Wy(N(a, b)) 
= e2" i(aay+bay— Fab) 
On[Vy] = ei = Wy (NAT (0, 1)) = Wy (N (c, d)) 


= e2" i(coytday— cd) : 


It thus follows that a discrete representation Me? of the CCR has to be chosen 


such that 
ab\ fau\ _ (ou N fab 
(ea) (a) = (a) + F (ea) met 


11 
12 
choice &u,v = N /2 mod | yields a finite-dimensional quantization of the Arnold 
Cat Map [119,163]. 


For instance, when in Example 2.1.3 a = —1, then A = ( = A! and the 
Furthermore, the automorphism $\Theta_N$ is implemented by a unitary operator $S_N : \mathbb{C}^N \to \mathbb{C}^N$ which can be determined by means of the Eq. (5.91) once it is represented with respect to the chosen basis $\{|j\rangle\}_{j=0}^{N-1}$:


N-1 
(kISwl€) = So Ste.pg (a | Sn Ip) 
p.q=0 
1 
Sre.pa = 5p 2 (PIWn(=m) |g) (k| Wun) |e) . 
neZ, 


5.6.1 Thermal States 


In the following, we shall focus upon states of quantum systems that are left invariant by the dynamics, namely we shall be interested in equilibrium states. A well-known class of such states is represented by the thermal or Gibbs states:

$$\rho_\beta := \frac{e^{-\beta H}}{Z_\beta}, \qquad Z_\beta := \mathrm{Tr}\big(e^{-\beta H}\big), \qquad (5.187)$$

at inverse temperature $T^{-1} = \beta$ relative to a Hamiltonian $H$. Let us consider a finite level system and the two-point time-correlation functions

$$F_{XY}(t) = \mathrm{Tr}\big(\rho_\beta\, \mathbb{U}_t[X]\, Y\big), \qquad G_{XY}(t) = \mathrm{Tr}\big(\rho_\beta\, Y\, \mathbb{U}_t[X]\big) \qquad (5.188)$$

for all $X, Y \in M_N(\mathbb{C})$, where the dynamical maps $\mathbb{U}_t$, $t \in \mathbb{R}$, are generated by (5.179). Simple manipulations based on the cyclicity of the trace show that

$$F_{XY}(t) = \mathrm{Tr}\big(Y \rho_\beta\, \mathbb{U}_t[X]\big) = \mathrm{Tr}\big(\rho_\beta\, Y \rho_\beta\, \mathbb{U}_t[X]\, \rho_\beta^{-1}\big) = \mathrm{Tr}\big(\rho_\beta\, Y e^{i H (t + i\beta)}\, X\, e^{-i H (t + i\beta)}\big) = G_{XY}(t + i\beta). \qquad (5.189)$$

The above equality expresses the Kubo-Martin-Schwinger (KMS) conditions in their simplest form [217,243] and Gibbs states as in (5.187) are the simplest instances of KMS states.
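The KMS relation (5.189) can be checked numerically for a finite-level system. The following is a minimal sketch, assuming NumPy; the Hamiltonian and the observables are random and illustrative, and `evolve` extends $\mathbb{U}_t[X] = e^{itH} X e^{-itH}$ to complex times through the spectral decomposition of $H$.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
d, beta, t = 3, 0.8, 0.4

def random_matrix():
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

A = random_matrix()
H = (A + A.conj().T) / 2            # illustrative Hamiltonian
X, Y = random_matrix(), random_matrix()
energies, V = np.linalg.eigh(H)

def exp_H(z):
    """exp(z H) for a complex number z."""
    return (V * np.exp(z * energies)) @ V.conj().T

rho_beta = exp_H(-beta)
rho_beta = rho_beta / np.trace(rho_beta)

def evolve(X, z):
    """U_z[X] = exp(i z H) X exp(-i z H), also for complex z."""
    return exp_H(1j * z) @ X @ exp_H(-1j * z)

def F(t):
    return np.trace(rho_beta @ evolve(X, t) @ Y)

def G(z):
    return np.trace(rho_beta @ Y @ evolve(X, z))

kms_gap = abs(F(t) - G(t + 1j * beta))
```

For the Gibbs state the quantity `kms_gap` vanishes up to rounding errors, while $F_{XY}(t)$ and $G_{XY}(t)$ at the same real time generically differ.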


Remarks 5.6.1 


1. Only the Gibbs state $\rho_\beta \in M_N(\mathbb{C})$ can have two-point correlation functions satisfying (5.189); in fact,

$$\mathrm{Tr}(\rho\, X\, Y) = \mathrm{Tr}\big(\rho\, Y\, \mathbb{U}_{i\beta}[X]\big) = \mathrm{Tr}\big(\mathbb{U}_{i\beta}[X]\, \rho\, Y\big)$$

for all $Y \in M_N(\mathbb{C})$ yields

$$\rho\, X = e^{-\beta H}\, X\, e^{\beta H} \rho \implies \big[e^{\beta H} \rho\, ,\, X\big] = 0$$

for all $X \in M_N(\mathbb{C})$ whence $\rho \propto e^{-\beta H}$. If the KMS conditions are taken as a signature of thermal equilibrium, the conclusion to be drawn from this example is that for finite degrees of freedom there can be only one equilibrium state at a given temperature. Therefore, in order to mathematically describe phase-transitions one needs infinitely many degrees of freedom [353].

2. At infinite temperature $\beta = 0$ and the Gibbs states reduce to a tracial state (see Example 5.3.3.3): $\tau(X) = \frac{1}{N} \mathrm{Tr}(X)$.

3. We have seen that faithful density matrices $\rho > 0$ are naturally associated with the modular operator (5.161). The modular operator defines the modular automorphisms $\sigma_t^\rho : \pi_\rho(M_N(\mathbb{C})) \to \pi_\rho(M_N(\mathbb{C}))$, $t \in \mathbb{R}$, given by


of it )(X)] = Al (XAT =p @ pn (X)p''@p"'. (5.190) 


They form a group, the modular group, and preserve the GNS state, $\Delta_\rho^{it} |\sqrt{\rho}\rangle = |\sqrt{\rho}\rangle$, so that (5.162) reads


a? ia TXI VP) = JIpm)(X)"|./p) = |./pX) . 


Further, it turns out that p is a KMS state at inverse temperature 3 = 1 with respect 
p 
toa_,, 


(Jp 102 COY) IVP) = Tep X pit Y) 


Z Tr(p¥ pit? xpt) = (VP ITY. gpp lT IVP) - 


By means of the modular group, when a faithful p is decomposed into a linear 
convex combination of other density matrices o j, p = > j Aja j, its decomposers 
in (5.156) can be recast as follows, 


Ajoj(X) = (,/p| T p(X) |./pX j )= (/p | TX) al salt (Xj)I IVP) 
= (Spl oP alm p(X TX) IVP) - (5.191) 


The two-point correlation functions $F_{XY}(t)$, respectively $G_{XY}(t)$, can be analytically extended to the strip $\{t + iy : -\beta < y < 0\}$, respectively to the strip $\{t + iy : 0 < y < \beta\}$, where they are continuous and bounded, including the boundaries where they satisfy the KMS conditions (5.189). When the Gibbs state (5.187) is a density matrix, $\rho_\beta \in \mathbb{B}_1(\mathbb{H})$, these properties, which are almost obvious in the case of finite-level systems, can be extended to systems with an infinite dimensional Hilbert space $\mathbb{H}$ [129,353].

Examples 5.6.2

1. Spin 1/2: The density matrix in Example 5.5.5 corresponds to Gibbs states with Hamiltonian $H = \omega\, \sigma_3$ and temperature $\beta^{-1}$ such that

$$\rho = \frac{1}{2} \begin{pmatrix} 1 - s & 0 \\ 0 & 1 + s \end{pmatrix} = \frac{e^{-\beta \omega \sigma_3}}{2 \cosh \beta\omega},$$

with $s = \tanh \beta\omega$. Therefore, the modular group consists of

$$\Delta_\rho^{it} = \rho^{it} \otimes \rho^{-it} = e^{-i t \beta \omega \sigma_3} \otimes e^{i t \beta \omega \sigma_3} = e^{-i t \beta \omega (\sigma_3 \otimes \mathbb{1}_2 - \mathbb{1}_2 \otimes \sigma_3)}.$$


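2. Fermions: Let $N$ Fermionic modes, described by creation and annihilation operators $a_i^\#$, $i = 1, 2, \ldots, N$, satisfying the CAR $\{a_i, a_j^\dagger\} = \delta_{ij}$, be equipped with a Hamiltonian operator

$$H = \sum_{i=1}^{N} \varepsilon_i\, a_i^\dagger a_i.$$

This can be regarded as the second quantization of an $N$-level, one-particle Hamiltonian $h = \sum_{i=1}^{N} \varepsilon_i |i\rangle\langle i| \in M_N(\mathbb{C})$ and $a_i^\dagger$ as the creation operator of a Fermion in the eigenstate $|i\rangle$ with energy $\varepsilon_i$.
The partition function $Z_\beta$ is easily calculated since for each mode the occupation number states are $|0\rangle$ and $|1\rangle$ (see Example 5.4.2):

$$Z_\beta = \mathrm{Tr}\big(e^{-\beta H}\big) = \prod_{i=1}^{N} \sum_{n_i=0}^{1} e^{-\beta \varepsilon_i n_i} = \prod_{i=1}^{N} \big(1 + e^{-\beta \varepsilon_i}\big),$$

whence the Gibbs state of $N$ non-interacting Fermions reads

$$\rho_\beta = \prod_{i=1}^{N} \big(1 + e^{-\beta \varepsilon_i}\big)^{-1}\, e^{-\beta H}.$$

In thermodynamics, Gibbs states are known as canonical equilibrium states, $\rho_\beta^{C}$, while gran-canonical states have the form

$$\rho_\beta^{GC} = \prod_{i=1}^{N} \big(1 + e^{-\beta(\varepsilon_i - \mu)}\big)^{-1}\, e^{-\beta (H - \mu \hat{N})},$$

where $\mu$ is the chemical potential and $\hat{N} = \sum_i a_i^\dagger a_i$ is the number operator. Two-point expectations read

$$\mathrm{Tr}\big(\rho_\beta^{GC}\, a_i^\dagger a_j\big) = \delta_{ij}\, \frac{z}{e^{\beta \varepsilon_i} + z}, \qquad (5.192)$$

where $z = e^{\beta\mu}$ is the so called fugacity. As regards higher order correlation functions, by means of the CAR anyone of them can be reduced to sums of expectations with equal numbers of annihilation and creation operators matching in pairs:

$$\mathrm{Tr}\big(\rho_\beta^{GC}\, a_{i_1}^\dagger \cdots a_{i_p}^\dagger\, a_{j_q} \cdots a_{j_1}\big) = \delta_{pq}\, \mathrm{Det}\Big( \big[\mathrm{Tr}(\rho_\beta^{GC}\, a_{i_k}^\dagger a_{j_\ell})\big]_{k,\ell=1}^{p} \Big). \qquad (5.193)$$

By suitably shifting the one-particle Hamiltonian, one may always assume the lowest eigenvalue (ground state energy) to be 0, whence

$$0 \leq \mathrm{Tr}\big(\rho_\beta^{GC}\, a_1^\dagger a_1\big) = \frac{z}{1 + z} \leq 1,$$

for the CAR imply $\|a^\dagger a\| \leq 1$, so that $z \geq 0$.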
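3. Bosons: Let $a_i^\#$, $i = 1, 2, \ldots, N$, satisfy the CCR $[a_i, a_j^\dagger] = \delta_{ij}$ of a system of $N$ Bosonic modes with a second-quantized Hamiltonian

$$H = \sum_{i=1}^{N} \varepsilon_i\, a_i^\dagger a_i, \qquad \varepsilon_i > 0.$$

The partition function reads

$$Z_\beta = \mathrm{Tr}\big(e^{-\beta H}\big) = \prod_{i=1}^{N} \sum_{n_i=0}^{\infty} e^{-\beta \varepsilon_i n_i} = \prod_{i=1}^{N} \big(1 - e^{-\beta \varepsilon_i}\big)^{-1},$$

and the canonical and gran-canonical equilibrium states have the form

$$\rho_\beta^{C} = \prod_{i=1}^{N} \big(1 - e^{-\beta \varepsilon_i}\big)\, e^{-\beta H}, \qquad \rho_\beta^{GC} = \prod_{i=1}^{N} \big(1 - e^{-\beta(\varepsilon_i - \mu)}\big)\, e^{-\beta (H - \mu \hat{N})}.$$

The Bose two-point correlation functions read

$$\mathrm{Tr}\big(\rho_\beta^{GC}\, a_i^\dagger a_j\big) = \delta_{ij}\, \frac{z}{e^{\beta \varepsilon_i} - z}, \qquad (5.194)$$

while $2N$-point ones are of the form

$$\mathrm{Tr}\big(\rho_\beta^{GC}\, a_{i_1}^\dagger \cdots a_{i_p}^\dagger\, a_{j_q} \cdots a_{j_1}\big) = \delta_{pq}\, \mathrm{Per}\Big( \big[\mathrm{Tr}(\rho_\beta^{GC}\, a_{i_k}^\dagger a_{j_\ell})\big]_{k,\ell=1}^{p} \Big), \qquad (5.195)$$

where, unlike for Fermions, $2N$-point correlation functions do not assign different signs to different permutations whence a permanent appears instead of a determinant. Furthermore, with the ground state energy set equal to 0,

$$\mathrm{Tr}\big(\rho_\beta^{GC}\, a_1^\dagger a_1\big) = \frac{z}{1 - z} \implies z < 1.$$

The fact that when $z \to 1$, the ground level can be infinitely populated is the source of the phenomenon of Bose-Einstein condensation. Canonical and gran-canonical $N$ Bose states as $\rho_\beta^{GC}$ are Gaussian (see Example 5.5.2); indeed, the characteristic function (5.123) of $\rho_\beta^{C}$ equals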


N 
f 1 T * Ý 
Tr(p9 wa) = TG — em) Tr(e Pa 4; fi gti 2; a) ; 


i= 


In order to compute the trace, it is convenient to split W(z) as in (5.79), and to 
use the overcomplete basis of coherent states (5.114) expressed as in (5.111); this 
yields 


The a} aj eaaa) = A ae es a 


iaj dw; ie Rowan EE 
Sy \zil al (wi | e%i4ie BEG; ai o za; |w; ) 
TE 


A dwi p.12 ai nenea Jat 
zg kil ej i eT lwil (vac | ei +7) o BEiG; Gig (z*—wj)a; lvac) . 
T 
Taking into account that (see (5.102)), for any a € C, 


i © ak 
Psala ge aca cY C H, [H, [Hal] = ea 
k=0 ` 


eS)» - 
k times 
eat =weal 
ewea date acéa'a =e% qi 
and that (5.78) yields 
eleli — e92 gaatyal — e9 eT era , aéeC, 


one obtains, by Gaussian integration, 


f dw; eo lwil? lodei eCitw aie peia] ai o (2¥—wi)a} lvac) = 
T 


e7lzi [e725 (l-e Fi yl 


L [PE ow -iune -we 
T 1 — e7 6zi 


Therefore, one derives the form of the correlation matrix (5.129) from 


N 
1 2 pEi 
Tr(p9 w(z)) = exp(—5 > |zi| coth =) 
i= 


1 x, [coth Eh 0 z 
= —-(z,— ; 5.196 
SR ( 4 ae ( 0 coth ch =z ( ) 


where h = ee ej|i)(i| (e1 = 0) has been used. 




4. The relation (5.89) in Example 5.4.3 allows one to equip the quantized hyperbolic automorphisms of the torus $\mathbb{T}^2$ with the $\Theta_N$-invariant state $\omega_N$ defined by


1 
wn (Wy (n)) = Tr(Wx (0)) = ond (5.197) 


on the Weyl operators and extended by linearity to their linear span, where it 
amounts to the normalized trace. 
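The two-point expectations (5.192) and (5.194) of items 2 and 3 of Examples 5.6.2 can be checked for a single mode. The following is a minimal sketch, assuming NumPy; the values of $\beta$, $\varepsilon$ and $\mu$ are arbitrary (with $\varepsilon - \mu > 0$) and the Bosonic Fock space is truncated at an assumed cut-off `n_max`, which introduces a negligible error for these parameters.

```python
import numpy as np

beta, eps, mu = 1.2, 0.7, -0.3          # illustrative values with eps - mu > 0
z = np.exp(beta * mu)                    # fugacity

# One Fermionic mode: Fock space spanned by |0>, |1>.
a_f = np.array([[0.0, 1.0], [0.0, 0.0]])            # annihilation operator
n_f = a_f.T @ a_f                                    # number operator a^dagger a
w_f = np.exp(-beta * (eps - mu) * np.diag(n_f))
rho_f = np.diag(w_f / w_f.sum())                     # gran-canonical state
occ_fermi = np.trace(rho_f @ n_f)

# One Bosonic mode on a truncated Fock space (cut-off is an assumption).
n_max = 200
a_b = np.diag(np.sqrt(np.arange(1, n_max)), k=1)    # truncated annihilation operator
n_b = a_b.T @ a_b
w_b = np.exp(-beta * (eps - mu) * np.diag(n_b))
rho_b = np.diag(w_b / w_b.sum())
occ_bose = np.trace(rho_b @ n_b)

fermi_formula = z / (np.exp(beta * eps) + z)         # right hand side of (5.192)
bose_formula = z / (np.exp(beta * eps) - z)          # right hand side of (5.194)
```

The Fermionic occupation never exceeds 1, while the Bosonic one diverges when $z$ approaches $e^{\beta\varepsilon}$, that is $z \to 1$ for the ground level with $\varepsilon = 0$.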


5.6.2 Quantum Operations 


A major departure from classical mechanics is represented by the role played in 
quantum mechanics by the measurement processes where a microscopic system, 
S, on which the measurement is performed, interacts with a (usually) macroscopic 
system, E, the measuring apparatus. 

In a classical, commutative context it is always possible, at least in principle, to make negligible the effects on $S$ due to its interaction with $E$; instead, in quantum mechanics states are generically unavoidably perturbed when undergoing a measurement process. The standard way the quantum mechanical perturbations are taken into account is via the so-called wave-packet reduction postulate; in its simplest formulation it goes as follows. Let $X = X^\dagger$ be an observable with discrete, finite and non-degenerate spectrum, say $X = \sum_{j=1}^{d} x_j P_j$, $P_j = |\psi_j\rangle\langle\psi_j|$. Upon measuring $X$ on a system $S$, the outcomes are the eigenvalues $x_j$; the measurement process can be schematized as follows: a beam of copies of a same system $S$, all prepared so as to be described by a same state $\rho$, is sent through an apparatus that measures the eigenvalues $x_j$, leaves the system state in the corresponding eigenprojections $P_j$ and directs the copies towards a screen with $d$ slits. By opening the $j$-th slit, the others being kept closed, only those systems on which the eigenvalue $x_j$ has been measured are collected. Suppose $N_j$ of the $N$ systems that interacted with the apparatus reach the screen through the $j$-th slit; then, the ratio $N_j/N$ approximates the quantity

$$p_j := \mathrm{Tr}(\rho\, P_j) = \langle \psi_j | \rho | \psi_j \rangle$$


when $N$ becomes sufficiently large. If no selection is operated, that is if all the $d$ slits are left open, after sufficiently many repetitions of the experiment with the same state preparation, the collected mixture of systems is described by the projections $P_j$ weighted with the corresponding mean values $p_j$. Thus, a typical non-selective measurement process changes the state as follows:

$$\rho \mapsto \sum_{j=1}^{d} p_j P_j = \sum_{j=1}^{d} |\psi_j\rangle\langle\psi_j| \rho |\psi_j\rangle\langle\psi_j| = \sum_{j=1}^{d} P_j\, \rho\, P_j =: \mathbb{F}_{\mathcal{P}}[\rho]. \qquad (5.198)$$

The map $\mathbb{F}_{\mathcal{P}}$ is linear on the state-space $\mathcal{S}(S)$ and transforms states into states: indeed, $\mathbb{F}_{\mathcal{P}}[\rho] \geq 0$ and $\mathrm{Tr}(\mathbb{F}_{\mathcal{P}}[\rho]) = \mathrm{Tr}(\rho) = 1$, as one can check by using the cyclicity of the trace and the fact that the $P_j$'s constitute a resolution of the identity, $\sum_{j=1}^{d} P_j = \mathbb{1}$.

In general, the change from $\rho$ into $\mathbb{F}_{\mathcal{P}}[\rho]$ transforms pure states into mixtures and may intuitively be associated with the loss of information due to the interaction with the many degrees of freedom of the macroscopic measuring apparatus. As a consequence, contrary to classical mechanics, quantum mechanics distinguishes between two state-changes, a reversible one due to the Liouville time-evolution and an irreversible one, the wave-packet reduction, describing the action of measurement processes.
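The action of the non-selective measurement map (5.198) can be illustrated on a qubit. The following is a minimal sketch, assuming NumPy and a measurement in the computational basis; the initial vector is an arbitrary normalized superposition.

```python
import numpy as np

# Pure initial state: a coherent superposition of the two eigenstates.
psi = np.array([0.6, 0.8j])
rho = np.outer(psi, psi.conj())

# Eigenprojections P_0 = |0><0| and P_1 = |1><1| of the measured observable.
projectors = [np.diag([1.0, 0.0]).astype(complex),
              np.diag([0.0, 1.0]).astype(complex)]

def F_P(rho):
    """Non-selective measurement: rho -> sum_j P_j rho P_j."""
    return sum(P @ rho @ P for P in projectors)

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

probs = [np.trace(rho @ P).real for P in projectors]   # p_j = Tr(rho P_j)
rho_out = F_P(rho)
```

The off-diagonal entries of `rho` are erased, the diagonal ones are the probabilities $p_j$, and the entropy grows from zero to the Shannon entropy of $\{p_j\}$; applying the map twice changes nothing further.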


Remark 5.6.2 Effectively, while being subjected to a measurement, any quantum micro-system is to be considered as an open quantum system dynamically and statistically correlated with the (infinitely) many degrees of freedom of the measuring instrument. This many-body interaction is usually not controllable and the phenomenological description of its overall effects is via maps as in (5.198). In particular, a measurement process of an observable on a system whose state $|\psi\rangle$ is a coherent superposition of the observable eigenstates, $|\psi\rangle = \sum_j c_j |\psi_j\rangle$, transforms it into a mixture $\rho = \sum_j |c_j|^2\, |\psi_j\rangle\langle\psi_j|$, with consequent loss of coherence.

The existence of two basic quantum time-evolutions, one reversible and typical of closed quantum systems, the other one irreversible and related to measurement processes, is unsatisfactory from an epistemological point of view. All the more so, since the irreversible macroscopic behavior of system plus apparatus should be deducible from the reversible dynamics of their constituent microsystems. Alongside the problem of reconciling thermodynamical irreversibility with microscopic reversibility, quantum mechanics raises the question of how to reconcile a reversible microscopic dynamics which preserves the purity of states with an irreversible macroscopic one which transforms pure states into mixtures. A number of approaches have been developed to attack this problem; for a thorough review of one of them, which is based on a modification of microscopic dynamics by the insertion of a decoherence mechanism with negligible effects on microsystems, but substantial ones on macrosystems, see [23].


It is convenient to extend the notion of wave-packet reduction to that of positive operator-valued measures (POVM).

The key property of a map as in (5.198) is its structure and the use on the right and left of $\rho$ of operators such that $\sum_j P_j^2\ (= \sum_j P_j) = \mathbb{1}$. The generalization is quite natural.


Definition 5.6.1 (Partitions of Unity) Let $E_j \in \mathbb{B}(\mathbb{H})$, $j \in J$, be a selection of operators such that $\sum_{j \in J} E_j^\dagger E_j = \mathbb{1}$: it is usually referred to as a POVM or a partition of unity. One associates to it the linear map $\mathbb{F}_{\mathcal{E}} : \mathcal{S}(S) \to \mathcal{S}(S)$,

$$\rho \mapsto \mathbb{F}_{\mathcal{E}}[\rho] = \sum_{j \in J} E_j\, \rho\, E_j^\dagger. \qquad (5.199)$$


In Example 5.6.4, we shall discuss the interpretation of generic POVMs in relation 
to measurement processes; for the moment, it suffices to stress that the operators 
forming POVMs need neither be self-adjoint nor orthogonal projections. While the 
von Neumann entropy is constant under the Liouville time-evolution, instead, under 
a generic POVM, it can increase or decrease. 


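Example 5.6.3 If $\rho = Q$ is a pure state (one-dimensional projection), then its von Neumann entropy $S(Q) = 0$, while under the action of a wave-packet reduction, $\mathbb{F}_{\mathcal{E}}[Q]$ gets mixed and $S(\mathbb{F}_{\mathcal{E}}[Q]) > 0$. However, if one starts with a mixed $\rho$, $S(\mathbb{F}_{\mathcal{E}}[\rho])$ can be smaller than $S(\rho)$: take for instance $E_1 = |1\rangle\langle 0|$, $E_2 = |1\rangle\langle 1|$, where $|0\rangle$, $|1\rangle$ are a basis in $\mathbb{C}^2$; then, $E_1^\dagger E_1 + E_2^\dagger E_2 = \mathbb{1}$ and, for all $\rho \in \mathcal{S}(S)$,

$$\mathbb{F}_{\mathcal{E}}[\rho] = |1\rangle\langle 0| \rho |0\rangle\langle 1| + |1\rangle\langle 1| \rho |1\rangle\langle 1| = |1\rangle\langle 1|\, \mathrm{Tr}(\rho) = |1\rangle\langle 1|.$$

Thus, for any given mixed $\rho$, $S(\rho) > S(\mathbb{F}_{\mathcal{E}}[\rho]) = 0$.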
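The entropy-decreasing POVM of Example 5.6.3 is easily reproduced numerically. The following is a minimal sketch, assuming NumPy and taking the totally mixed qubit state as the (arbitrary) initial mixture.

```python
import numpy as np

E1 = np.array([[0.0, 0.0], [1.0, 0.0]])   # |1><0|
E2 = np.array([[0.0, 0.0], [0.0, 1.0]])   # |1><1|

def F_E(rho):
    """rho -> E1 rho E1^dagger + E2 rho E2^dagger."""
    return E1 @ rho @ E1.conj().T + E2 @ rho @ E2.conj().T

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

rho_mixed = np.eye(2) / 2                  # totally mixed state, entropy log 2
rho_out = F_E(rho_mixed)
```

Whatever the input, the output is the pure state $|1\rangle\langle 1|$, so that the entropy drops from $\log 2$ to zero.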


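The use of non-projective POVMs is rather common; for instance, a non-selective measurement of the intrinsic angular momentum $\frac{1}{2}\sigma_3$ of a spin 1/2 particle along the 3rd axis yields the map

$$\rho \mapsto \mathbb{F}_3[\rho] = P_0^{(3)}\, \rho\, P_0^{(3)} + P_1^{(3)}\, \rho\, P_1^{(3)}, \qquad (5.200)$$

where $\sigma_3 P_j^{(3)} = (-1)^j P_j^{(3)}$, $j = 0, 1$. Proceeding with a non-selective measurement along the 1st axis yields

$$\rho \mapsto \mathbb{F}_1[\rho] = P_0^{(1)}\, \rho\, P_0^{(1)} + P_1^{(1)}\, \rho\, P_1^{(1)}, \qquad (5.201)$$

where $\sigma_1 P_j^{(1)} = (-1)^j P_j^{(1)}$, $j = 0, 1$, so that the cumulative effect of the two consecutive measurement processes is described by the map

$$\rho \mapsto \mathbb{F}_{13}[\rho] = \sum_{i,j=0}^{1} F_{ij}\, \rho\, F_{ij}^\dagger, \qquad F_{ij} := P_i^{(1)} P_j^{(3)}, \qquad (5.202)$$

where the operators $F_{ij}$ are neither projectors nor Hermitean.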
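The properties of the operators $F_{ij}$ in (5.202) can be verified directly. The following is a minimal sketch, assuming NumPy and the standard representation of $\sigma_1$ and $\sigma_3$; the test state is arbitrary.

```python
import numpy as np

I2 = np.eye(2)
s1 = np.array([[0.0, 1.0], [1.0, 0.0]])

# Eigenprojections: sigma_3 P3[j] = (-1)^j P3[j], sigma_1 P1[j] = (-1)^j P1[j].
P3 = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
P1 = [(I2 + s1) / 2, (I2 - s1) / 2]

# F_ij = P1_i P3_j: first measure along the 3rd axis, then along the 1st.
F = {(i, j): P1[i] @ P3[j] for i in (0, 1) for j in (0, 1)}

def F_13(rho):
    """Cumulative effect of the two consecutive non-selective measurements."""
    return sum(Fij @ rho @ Fij.conj().T for Fij in F.values())

completeness = sum(Fij.conj().T @ Fij for Fij in F.values())

psi = np.array([0.6, 0.8])
rho_out = F_13(np.outer(psi, psi))
```

The $F_{ij}$ form a partition of unity although none of them is a projector or Hermitian, and the two consecutive measurements map every qubit state into the totally mixed one.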


Consider the statement

measuring the (orthonormal) POVM $\mathcal{P} := \{P_j\}_{j \in J} \subset \mathbb{B}(\mathbb{H})$ on the system $S$ in the state $\rho$ corresponds to the irreversible map $\rho \mapsto \mathbb{F}_{\mathcal{P}}[\rho] = \sum_{j \in J} P_j\, \rho\, P_j$.

This has an acceptable interpretation in physical terms for the orthogonality of the $P_j$'s reduces the experimental measure of $\mathcal{P}$ to an experiment with $\#(J)$ slits. The same argument does not directly work when projective POVMs $\mathcal{P}$ are substituted with generic $\mathcal{E} := \{E_j\}_{j \in J}$, $\sum_{j \in J} E_j^\dagger E_j = \mathbb{1}$. Consider the statement

measuring a generic POVM $\mathcal{E} := \{E_j\}_{j \in J} \subset \mathbb{B}(\mathbb{H})$ on the system $S$ in the state $\rho$ corresponds to the irreversible map $\rho \mapsto \mathbb{F}_{\mathcal{E}}[\rho] = \sum_{j \in J} E_j\, \rho\, E_j^\dagger$.

In order to give it a meaning, one has to specify what is measured and on which system; indeed, the non-orthogonality of the $E_j$'s makes untenable the straightforward interpretation accorded to projective measurements. An answer to the above question is given in terms of couplings to ancillas and partial tracing.


Example 5.6.4 ([266]) Let $\mathcal{E} := \{E_j\}_{j \in J} \subset \mathbb{B}(\mathbb{H})$ be a POVM for a system $S$. Let $R$ be an auxiliary system described by a Hilbert space $\mathbb{K}$ which provides an abstract quantum description of an instrument to which $S$ is coupled during a measurement. A schematic description of a measurement process associated with $\mathcal{E}$ is as follows:

1. there exist orthonormal bases, $\{|\psi_i\rangle\} \subset \mathbb{H}$ and $\{|k\rangle\}_{k \geq 0} \subset \mathbb{K}$, with $|0\rangle$ corresponding to the ready-state of the measurement apparatus;

2. there is a unitary time-evolution operator $U_t$ on $\mathbb{H} \otimes \mathbb{K}$ such that, at the end of the process, at time $t = T$ say, for any initial $\psi \in \mathbb{H}$, one has

$$U_T\, |\psi\rangle \otimes |0\rangle = \sum_{j \in J} E_j |\psi\rangle \otimes |j\rangle =: |\Psi\rangle.$$

The unitary operator $U_T$ is well-defined: indeed, the right hand side of the above equality can be taken as a definition of $U_T$ as a linear operator from $\mathbb{H} \otimes |0\rangle$ into $\mathbb{H} \otimes \mathbb{K}$. Since $\sum_{j \in J} E_j^\dagger E_j = \mathbb{1}_S$, where $\mathbb{1}_S$ denotes the identity operator on $\mathbb{H}$, scalar products of vectors in the subspace $\mathbb{H} \otimes |0\rangle$ are preserved and the isometry $U_T$ can be extended to a unitary operator on $\mathbb{H} \otimes \mathbb{K}$. Let the compound system $S + R$ be in the state $\Psi$; according to the postulate of wave-packet reduction, by measuring the eigenprojectors $\mathbb{1}_S \otimes P_k$, $P_k := |k\rangle\langle k|$, $k \in J$, the outcoming (not-normalized) states

$$(\mathbb{1}_S \otimes P_k)\, |\Psi\rangle\langle\Psi|\, (\mathbb{1}_S \otimes P_k) = \big( E_k |\psi\rangle\langle\psi| E_k^\dagger \big) \otimes P_k$$

are obtained with probabilities

$$\pi_k(\psi) = \langle \Psi |\, \mathbb{1}_S \otimes P_k\, | \Psi \rangle = \langle \psi | E_k^\dagger E_k | \psi \rangle.$$

By disregarding the "instrument" $R$, the overall effect of the entire process on the system $S$ alone is as follows:

• by measuring the projections $\mathbb{1}_S \otimes P_k$ on $S + R$ after the action of $U_T$ on $|\psi\rangle \otimes |0\rangle$, the state $|\psi\rangle$ changes into the normalized states

$$|\psi_k\rangle\langle\psi_k| := \frac{E_k |\psi\rangle\langle\psi| E_k^\dagger}{\langle \psi | E_k^\dagger E_k | \psi \rangle}$$

with probabilities $\pi_k(\psi)$;

• without selection, the overall effect is

$$|\psi\rangle\langle\psi| \mapsto \sum_{j \in J} \pi_j(\psi)\, |\psi_j\rangle\langle\psi_j| = \sum_{j \in J} E_j |\psi\rangle\langle\psi| E_j^\dagger = \mathbb{F}_{\mathcal{E}}[|\psi\rangle\langle\psi|].$$

• The process described by (5.199) is obtained by linear extension of the action of $\mathbb{F}_{\mathcal{E}}$ from projectors to mixtures of projectors.
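The coupling to an ancilla described in Example 5.6.4 can be made concrete for a qubit. The sketch below assumes NumPy and, purely as an illustration not taken from the text, the two operators of an amplitude-damping process with parameter `gamma`; it builds only the isometry $|\psi\rangle \mapsto \sum_j E_j|\psi\rangle \otimes |j\rangle$, not its unitary extension.

```python
import numpy as np

gamma = 0.3   # illustrative damping parameter
E = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
     np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
d, k = 2, len(E)
ancilla = np.eye(k)   # columns are the ancilla basis vectors |j>

# Isometry V: H -> H (x) K with V|psi> = sum_j E_j|psi> (x) |j>.
V = sum(np.kron(Ej, ancilla[:, [j]]) for j, Ej in enumerate(E))

psi = np.array([0.6, 0.8])
Psi = V @ psi

# M[i, j] = (E_j psi)_i: the column j is the system vector correlated with |j>.
M = Psi.reshape(d, k)
probs = np.sum(np.abs(M) ** 2, axis=0)      # probabilities of the ancilla outcomes
rho_S = M @ M.conj().T                      # partial trace over the ancilla

expected_probs = [psi.conj() @ Ej.conj().T @ Ej @ psi for Ej in E]
expected_rho = sum(Ej @ np.outer(psi, psi.conj()) @ Ej.conj().T for Ej in E)
```

The condition $\sum_j E_j^\dagger E_j = \mathbb{1}$ is equivalent to $V^\dagger V = \mathbb{1}$, a projective measurement on the ancilla reproduces the probabilities $\langle\psi|E_j^\dagger E_j|\psi\rangle$, and tracing out the ancilla returns $\sum_j E_j |\psi\rangle\langle\psi| E_j^\dagger$.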


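Remark 5.6.3 The previous one is an example of dilation of a CP map to a unitary evolution on a larger system from which the former is obtained by partial tracing [131]. In general, any POVM $\mathcal{E} := \{E_j\}_{j \in J} \subset \mathbb{B}(\mathbb{H})$ can be dilated to a projective POVM $\mathcal{P} = \{P_j\}_{j \in J}$ consisting of orthogonal projectors $P_j$ on a larger Hilbert space $\mathbb{K}$ [174]. For POVMs such that $\mathrm{card}(J) = d$, the proof goes as follows: consider the Hilbert space $\mathbb{K}_{\mathcal{E}} \subseteq \mathbb{H} \otimes \mathbb{C}^d$ linearly spanned by vectors of the form

$$|\Psi\rangle_{\mathcal{E}} = \sum_{j=1}^{d} E_j |\psi_j\rangle \otimes |j\rangle,$$

where $|\psi_j\rangle \in \mathbb{H}$ and $\{|j\rangle\}_{j=1}^{d}$ is an ONB in the auxiliary Hilbert space $\mathbb{C}^d$. Let $|\phi\rangle \in \mathbb{H}$ and set $|\Psi_\phi\rangle_{\mathcal{E}} := \sum_{j=1}^{d} E_j |\phi\rangle \otimes |j\rangle$. The operators $P_i$ on $\mathbb{K}_{\mathcal{E}}$ defined by

$$P_i\, |\Psi\rangle_{\mathcal{E}} = E_i |\psi_i\rangle \otimes |i\rangle$$

are orthogonal projections such that $\sum_{i=1}^{d} P_i = \mathbb{1}$ on $\mathbb{K}_{\mathcal{E}}$ and the projective POVM $\mathcal{P} = \{P_j\}_{j=1}^{d} \subset \mathbb{B}(\mathbb{K}_{\mathcal{E}})$ is such that

$$\sum_{j=1}^{d} P_j\, |\Psi_\phi\rangle_{\mathcal{E}}\, {}_{\mathcal{E}}\langle \Psi_\phi |\, P_j = \sum_{j=1}^{d} E_j |\phi\rangle\langle\phi| E_j^\dagger \otimes |j\rangle\langle j|,$$

whence (5.199) results by tracing over the auxiliary Hilbert space $\mathbb{C}^d$.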
By rewriting the map in (5.199) as

$$\mathbb{F}_{\mathcal{E}}[\rho] = \sum_{j\in J}\lambda_j\,\sigma_j , \qquad \lambda_j := \mathrm{Tr}\big(\rho\,E_j^\dagger E_j\big)\ge 0 , \qquad \sigma_j := \frac{E_j\,\rho\,E_j^\dagger}{\lambda_j} , \tag{5.203}$$

one identifies the statistical weights associated with the POVM operators $E_j$ and
thus the statistical ensemble $\{\lambda_j,\sigma_j\}_{j\in J}$ it generates. Although the indices j of the
POVM do not refer to probabilities since the $\sigma_j$ are not in general orthogonal to each
other, the argument of Example 5.6.4 gives a sound sense to the statement

$\lambda_j = \mathrm{Tr}(\rho\,E_j^\dagger E_j)$ is the probability of getting the index j as an outcome by
measuring the POVM $\mathcal{E} = \{E_j\}_{j\in J}$ on a quantum system in the state $\rho$.


Examples 5.6.5 


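1. Although generic POVMs describe the most general measurements performable
on quantum systems, they cannot discriminate exactly between non-orthogonal
state vectors $\langle\psi_1|\psi_2\rangle\neq0$, $|\psi_1\rangle,|\psi_2\rangle\in\mathbb{C}^d$. Exact discrimination by a POVM $\mathcal{E} =
\{E_j\}_{j\in J}$ means that the index set J can be split into two non-intersecting subsets
$J = J_1\cup J_2$, the probabilities of whose indices are such that

$$\langle\psi_1|\sum_{j\in J_1}E_j^\dagger E_j|\psi_1\rangle = 1 , \qquad \langle\psi_2|\sum_{j\in J_1}E_j^\dagger E_j|\psi_2\rangle = 0 ,$$
$$\langle\psi_1|\sum_{j\in J_2}E_j^\dagger E_j|\psi_1\rangle = 0 , \qquad \langle\psi_2|\sum_{j\in J_2}E_j^\dagger E_j|\psi_2\rangle = 1 .$$

Notice that the second and third conditions yield

$$E_j|\psi_1\rangle = 0 \quad \forall j\in J_2 , \qquad E_j|\psi_2\rangle = 0 \quad \forall j\in J_1 ,$$

so that with $\Pi_{1,2} := \sum_{j\in J_{1,2}}E_j^\dagger E_j$ and $\sum_{j\in J}E_j^\dagger E_j = \mathbb{1}$ one retrieves

$$\langle\psi_1|\psi_2\rangle = \langle\psi_1|\Pi_1|\psi_2\rangle + \langle\psi_1|\Pi_2|\psi_2\rangle = 0 .$$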


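2. Given a statistical ensemble consisting of two density matrices $\rho_{1,2}\in M_d(\mathbb{C})$ with
weights $0<\mu<1$ and $1-\mu$, the POVM $\mathcal{E} = \{E_1,E_2\}$ that best distinguishes
them can be obtained by associating $E_1^\dagger E_1$ with $\rho_1$, $E_2^\dagger E_2$ with $\rho_2$ and optimizing
the success probability

$$P_{succ} = \mu\,\mathrm{Tr}\big(\rho_1E_1^\dagger E_1\big) + (1-\mu)\,\mathrm{Tr}\big(E_2^\dagger E_2\,\rho_2\big) .$$

By means of the so-called Helstrom matrix,

$$\Delta_\mu(\rho_1,\rho_2) = \mu\rho_1 - (1-\mu)\rho_2 ,$$

and using that $E_2^\dagger E_2 = \mathbb{1} - E_1^\dagger E_1$, $P_{succ}$ can be recast as

$$P_{succ} = 1-\mu + \mathrm{Tr}\big(\Delta_\mu(\rho_1,\rho_2)\,E_1^\dagger E_1\big) .$$

Let $P_+$ be the orthogonal projection onto the positive part of the Helstrom
matrix (see Remark 5.2.3), namely $\Delta_\mu(\rho_1,\rho_2) = \Delta_\mu^+(\rho_1,\rho_2) - \Delta_\mu^-(\rho_1,\rho_2)$ and
$\Delta_\mu(\rho_1,\rho_2)\,P_+ = \Delta_\mu^+(\rho_1,\rho_2)$; then from

$$P_{succ} = 1-\mu + \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\,E_1^\dagger E_1\big) - \mathrm{Tr}\big(\Delta_\mu^-(\rho_1,\rho_2)\,E_1^\dagger E_1\big)$$

one sees that it is maximized by choosing $E_1^\dagger E_1 = P_+$ so that

$$\mathrm{Tr}\big(\Delta_\mu^-(\rho_1,\rho_2)\,E_1^\dagger E_1\big) = 0 , \qquad \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\,E_1^\dagger E_1\big) = \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\big) .$$

It follows that

$$P_{succ}^{opt} = 1-\mu + \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\big) .$$

Finally, using that

$$\mathrm{Tr}\big(\Delta_\mu(\rho_1,\rho_2)\big) = \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\big) - \mathrm{Tr}\big(\Delta_\mu^-(\rho_1,\rho_2)\big) = 2\mu-1 ,$$
$$\|\Delta_\mu(\rho_1,\rho_2)\|_1 = \mathrm{Tr}\big(\Delta_\mu^+(\rho_1,\rho_2)\big) + \mathrm{Tr}\big(\Delta_\mu^-(\rho_1,\rho_2)\big) ,$$

one can express $P_{succ}^{opt}$ as a function of the trace-norm of the Helstrom matrix:

$$P_{succ}^{opt} = \frac{1+\|\Delta_\mu(\rho_1,\rho_2)\|_1}{2} .$$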
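
As a numerical illustration of the optimization just described, the sketch below computes the optimal success probability in two ways, via the projector onto the positive part of the Helstrom matrix and via its trace-norm. The function name `helstrom` and the choice of the two pure states $|0\rangle$ and $|+\rangle$ with equal weights are assumptions made only for this example.

```python
import numpy as np

def helstrom(rho1, rho2, mu):
    """Optimal success probability for discriminating rho1 (weight mu) from rho2."""
    delta = mu * rho1 - (1 - mu) * rho2          # Helstrom matrix
    evals, evecs = np.linalg.eigh(delta)
    pos = evecs[:, evals > 0]
    p_plus = pos @ pos.conj().T                  # projector onto the positive part
    p_via_projector = 1 - mu + np.trace(delta @ p_plus).real
    p_via_trace_norm = 0.5 * (1 + np.abs(evals).sum())
    return p_via_projector, p_via_trace_norm

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho1 = np.outer(ket0, ket0)
rho2 = np.outer(ket_plus, ket_plus)

p_proj, p_norm = helstrom(rho1, rho2, mu=0.5)
print(p_proj, p_norm)  # both equal (1 + 1/sqrt(2))/2 up to rounding
```

For these two states the Helstrom matrix has eigenvalues $\pm1/(2\sqrt2)$, so that its trace-norm is $1/\sqrt2$ and both expressions give $(1+1/\sqrt2)/2\simeq0.854$.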


5.6.3 Open Quantum Dynamics 


Despite the practical impossibility of describing the interaction between a micro- 
system and a macro-system during a measurement process, it is not without hope 
to try a dynamical derivation of the wave-packet reduction (5.199). The idea is that 
the latter is a time asymptotic effect of a many-body interaction whose time-scale 
is much shorter than the duration of the process. The phenomenological description 
of the process cannot be given in terms of an automorphism $\mathbb{U}_t$: on the one hand, $\mathbb{U}_t$
is reversible, while the wave-packet reduction is not; on the other hand, $\mathbb{U}_t$ cannot
transform pure into mixed states.

A straightforward way to extend the quantum time-evolution beyond the reversible 
one generated by the unitary Liouville equation is to add some extra structure to 
(5.178). One observes that the commutator corresponds to a linear action on the 
state-space $\mathcal{S}(S)$, and that the generated dynamical maps $\mathbb{U}_t$ satisfy the composition
law $\mathbb{U}_t\circ\mathbb{U}_s = \mathbb{U}_{t+s}$ for all $s,t\in\mathbb{R}$.

A sensible step is to modify (5.178) by adding to the commutator a linear term
that breaks time-reversibility and generates a semi-group $\gamma_t$, $t\ge0$, of linear maps
obeying a forward-in-time composition law

$$\gamma_t\circ\gamma_s = \gamma_{t+s} , \qquad s,t\ge0 . \tag{5.204}$$

Namely, one tries to approximate (5.178) with a time-evolution equation of the form

$$\partial_t\rho(t) = \mathbb{L}_H[\rho(t)] + \mathbb{D}[\rho(t)] . \tag{5.205}$$

Formally, the semi-group of linear maps $\{\gamma_t\}_{t\ge0}$, solutions of (5.205), is obtained
by exponentiating the generator:

$$\rho\mapsto\rho_t = \gamma_t[\rho] , \qquad \gamma_t = e^{t\mathbb{L}} , \qquad \mathbb{L} = \mathbb{L}_H + \mathbb{D} . \tag{5.206}$$


Not all linear maps $\mathbb{D}$ lead to physically consistent irreversible time-evolutions $\gamma_t$;
the following conditions turn out to be necessary:

1. $\mathrm{Tr}(\mathbb{D}[\rho]) = 0$: since $\mathrm{Tr}([H,\rho_t]) = 0$, this implies trace-conservation $\partial_t\mathrm{Tr}(\rho_t) = 0$;

2. $\mathbb{D}[\rho]^\dagger = \mathbb{D}[\rho]$: this guarantees preservation of hermiticity;

3. the positivity of $\gamma_t[\rho]$ must be preserved at all times $t\ge0$.


While the first condition can be relaxed, for instance in the case of decaying
systems [12], which we shall not consider, the other two conditions are instead
necessary to ensure that the $\gamma_t$ map density matrices into density matrices. However,
positivity-preservation alone does not suffice for the full physical consistency of
$\gamma_t$: the stronger property of complete positivity discussed in Sect. 5.2.2 turns out to
be necessary. Despite its mathematical origin, this notion is deeply rooted in quantum
physics. Its importance was first appreciated in the theory of open quantum
systems [12,117,215].

Equations of the form (5.205) that lead to semi-groups of dynamical maps that
break time-reversibility are usually derived when one thinks of S as a subsystem
immersed in a large (infinite) reservoir, or heat bath, R. Practically, one deals with
a situation similar to the one in Example 5.6.4 and uses a partial tracing technique:
the system S is not closed, but coupled to a large system R. The system S + R is
described by the tensor product Hilbert space $\mathbb{H}\otimes\mathbb{K}$, its states $\rho_{S+R}$ are density
matrices on such a space and evolve in time according to the unitary time-evolution

$$\rho_{S+R}\mapsto\rho_{S+R}(t) = \mathbb{U}_t^{S+R}[\rho_{S+R}] = U_{S+R}(t)\,\rho_{S+R}\,U_{S+R}^\dagger(t) ,$$

generated through (5.178) by a Hamiltonian of the form

$$H = H_S + H_R + \lambda H_I , \tag{5.207}$$

where $H_{S,R}$ are the Hamiltonian operators describing the reversible time-evolutions
of system and reservoir alone, while $H_I$ takes into account their interactions, with $\lambda$
a dimensionless coupling constant.

The interaction Hamiltonian is such that there are practically no hopes to arrive at
an explicit unitary time-evolution $U_{S+R}(t)$. On the other hand, one is interested in the
dynamics of the open system S alone. Furthermore, in many situations of physical
interest, one may reasonably assume that there are no statistical correlations between
system and reservoir at $t=0$; namely, the initial state of the compound system can
be taken of the factorized form $\rho_{S+R} = \rho_S\otimes\rho_R$. Then, by
tracing over the environment degrees of freedom, one obtains a one-parameter family
of dynamical maps

$$\rho_S\mapsto\rho_S(t) := \mathrm{Tr}_R\big(\rho_{S+R}(t)\big) \tag{5.208}$$

on the state-space of S which is called reduced dynamics.

Together with the fixed form of the initial condition, the elimination of the
environment degrees of freedom by means of the partial trace $\mathrm{Tr}_R$ makes the evolution
irreversible. The factorized initial condition does get entangled in the course of time,
so that, in general, the family $\{\rho_S(t)\}_{t\ge0}$ of states satisfies a highly complicated
integro-differential evolution equation of the form

$$\partial_t\rho_S(t) = \int_0^t ds\,\mathbb{L}^\lambda_s[\rho_S(t-s)] , \tag{5.209}$$

in which the linear operator $\mathbb{L}^\lambda_s$ on the state space $\mathcal{S}(S)$ of the system exhibits memory
effects that account for the entanglement of the system with the reservoir from time
$t=0$ to time $t>0$. Before dealing with how one can eliminate the memory effects
and get an evolution equation that generates a semi-group of maps on $\mathcal{S}(S)$, it is
convenient to examine (5.208) in some more detail.

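Let us assume, for the sake of simplicity, that the initial state of the reservoir is described
by a density matrix $\rho_R = \sum_j p_j\,|\psi_j^R\rangle\langle\psi_j^R|$, where $p_j\ge0$, $\sum_j p_j = 1$, and the
$\psi_j^R$ form an orthonormal basis in $\mathbb{K}$. We use them to calculate the partial trace $\mathrm{Tr}_R$:

$$\rho_S(t) = \sum_{i,j}p_j\,\langle\psi_i^R|U_{S+R}(t)|\psi_j^R\rangle\,\rho_S\,\langle\psi_j^R|U_{S+R}^\dagger(t)|\psi_i^R\rangle .$$

Notice that the matrix elements provide operators

$$V_{ij}(t) = \sqrt{p_j}\,\langle\psi_i^R|U_{S+R}(t)|\psi_j^R\rangle : \mathbb{H}\to\mathbb{H} ,$$

so that the reduced dynamics corresponds to maps

$$\rho_S\mapsto\gamma_t[\rho_S] := \sum_{i,j}V_{ij}(t)\,\rho_S\,V_{ij}^\dagger(t) . \tag{5.210}$$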


According to Proposition 5.2.4, the resulting dynamical maps are CPU with the $V_{ij}(t)$
as Kraus operators. However, they do not form a semi-group because of the memory
effects built in the integro-differential equation they satisfy. Under the hypothesis of
a very weak coupling between S and R ($\lambda\ll1$), a semi-group reduced dynamics is
obtained by performing suitable Markov approximations, the most straightforward
being the substitution

$$\int_0^t ds\,\mathbb{L}^\lambda_s[\rho_S(t-s)] \;\longmapsto\; \mathbb{L}[\rho_S(t)] := \Big(\int_0^{+\infty}ds\,\mathbb{L}^\lambda_s\Big)[\rho_S(t)] .$$

Since the memory effects have been eliminated, $\mathbb{L}$ is a generator corresponding to a
Liouville equation or master equation as in (5.206).
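
The Kraus form (5.210) of the reduced dynamics and its lack of the semi-group property can be illustrated on a small finite-dimensional toy model. The sketch below rests on assumptions that are not in the text: a 3-level "reservoir", a random total Hamiltonian standing in for $H_S+H_R+\lambda H_I$, and a reservoir state diagonal in the computational basis. It builds the operators $V_{ij}(t)$, compares the resulting map with the partial trace of the unitarily evolved state, and measures how far the map at time $2t$ is from the map at time $t$ applied twice.

```python
import numpy as np

rng = np.random.default_rng(1)
dS, dR = 2, 3  # hypothetical dimensions of system and (finite) reservoir

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

H = random_hermitian(dS * dR)            # stand-in for H_S + H_R + lambda H_I
h_vals, h_vecs = np.linalg.eigh(H)

def unitary(t):
    return h_vecs @ np.diag(np.exp(-1j * h_vals * t)) @ h_vecs.conj().T

p = np.array([0.5, 0.3, 0.2])            # reservoir weights p_j in its eigenbasis
rho_R = np.diag(p).astype(complex)

def kraus_operators(t):
    """V_ij(t) = sqrt(p_j) <i|U(t)|j>, each a dS x dS matrix."""
    U = unitary(t).reshape(dS, dR, dS, dR)
    return [np.sqrt(p[j]) * U[:, i, :, j] for i in range(dR) for j in range(dR)]

def reduced_map(rho_S, t):
    return sum(V @ rho_S @ V.conj().T for V in kraus_operators(t))

def partial_trace_R(rho_SR):
    return np.einsum('iaja->ij', rho_SR.reshape(dS, dR, dS, dR))

rho_S = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
t = 0.8
U = unitary(t)
direct = partial_trace_R(U @ np.kron(rho_S, rho_R) @ U.conj().T)
via_kraus = reduced_map(rho_S, t)
completeness = sum(V.conj().T @ V for V in kraus_operators(t))
semigroup_gap = np.linalg.norm(
    reduced_map(rho_S, 2 * t) - reduced_map(reduced_map(rho_S, t), t))
print(np.allclose(direct, via_kraus), np.allclose(completeness, np.eye(dS)), semigroup_gap)
```

A non-vanishing `semigroup_gap` reflects the memory effects mentioned above; in this toy model the coupling is not weak, so no Markovian approximation is expected to hold.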

In the so-called weak-coupling limit [127], the Markov approximation sketched
above can be understood as follows: an expansion to second order in the small
coupling constant $\lambda$ shows that (5.209) becomes

$$\partial_t\rho_S(t) = -i[H_S,\rho_S(t)] + \lambda^2\int_0^t ds\,\mathbb{D}(s)[\rho_S(t-s)] ,$$

where $\mathbb{D}[\cdot]$ acts linearly on the state space $\mathcal{S}(S)$. The effects due to the presence of
the reservoir are thus visible on a time-scale $\tau = t\lambda^2$ which is slow as $\lambda\ll1$; by
rescaling, the evolution equation reads

$$\partial_\tau\rho_S(\tau\lambda^{-2}) = -\frac{i}{\lambda^2}\big[H_S,\rho_S(\tau\lambda^{-2})\big] + \int_0^{\tau\lambda^{-2}}ds\,\mathbb{D}(s)\big[\rho_S(\tau\lambda^{-2}-s)\big] .$$

Then, by letting $\lambda\to0$ one replaces the upper integration limit by $+\infty$ and neglects
s in comparison with $\tau\lambda^{-2}$ in the argument of the state appearing in the integral. The
problem with too naive Markovian approximations such as this one is that they very rarely
lead to irreversible evolutions that are positivity preserving [127]: most derivations
provide time-evolutions that are not positive and generate physically inconsistent
negative probabilities. For instance, the wild oscillations due to the system Hamiltonian
term $\lambda^{-2}H_S$ when $\lambda\to0$ make it intuitively plausible to use an ergodic average
to smooth away too fast effects [116].


5.6.3.1 Irreversible Dynamics Within the Bloch Sphere 
With respect to Example 5.6.1.1, it proves convenient to represent density matrices
$\rho\in M_2(\mathbb{C})$ by Bloch vectors with one more component corresponding to the
coefficient of $\sigma_0$ in the expansion (5.110).

We shall identify $\rho$ as a 4-dimensional ket $\mathbb{R}^4\ni|\rho\rangle := (1,\rho_1,\rho_2,\rho_3)$. As a
consequence, the linear action of the generator $\mathbb{L}:\rho\mapsto\mathbb{L}[\rho]$ in (5.206) corresponds to a
$4\times4$ matrix $\mathcal{L} = [\mathcal{L}_{\mu\nu}]$ acting on $|\rho\rangle$. The Liouville equation (5.178) thus becomes

$$\partial_t|\rho_t\rangle = \mathcal{L}|\rho_t\rangle = -2(\mathcal{H}+\mathcal{D})|\rho_t\rangle ,$$

with $-2\mathcal{H}$ and $-2\mathcal{D}$ $4\times4$ matrices corresponding to the commutator $\mathbb{L}_H$ and the
added term $\mathbb{D}$ in (5.205) ($-2$ has been inserted for convenience). Concerning the
matrix $\mathcal{D}$, the request of trace and hermiticity preservation imposes $\mathcal{D}_{0j} = 0$, $j =
1,2,3$, and $\mathcal{D}_{\mu\nu}\in\mathbb{R}$. By splitting $\mathcal{D}$ into the sum of a symmetric and antisymmetric
matrix, the latter corresponds to a Hamiltonian contribution that can be incorporated
into $\mathcal{H}$. Thus, one remains with a purely dissipative matrix

$$\mathcal{D} = \begin{pmatrix} 0 & 0 & 0 & 0\\ u & a & b & c\\ v & b & \alpha & \beta\\ w & c & \beta & \gamma \end{pmatrix} , \tag{5.211}$$

with 9 real parameters which depend on the phenomenology of the system-environment
interaction and can, in principle, be tested in dedicated experiments [42].

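By exponentiating $\mathcal{L}$, one gets a one-parameter semi-group of $4\times4$ matrices,
$\{\mathcal{G}_t\}_{t\ge0}$, such that $\mathcal{G}_t = e^{-2t(\mathcal{H}+\mathcal{D})}$, which corresponds to the semi-group $\{\gamma_t\}_{t\ge0}$ on
the state-space $\mathcal{S}(S)$ given by

$$\rho\mapsto\rho(t) = \gamma_t[\rho] = \frac12\sum_{\mu=0}^{3}\rho_\mu(t)\,\sigma_\mu , \qquad \rho_\mu(t) = \big(\mathcal{G}_t|\rho\rangle\big)_\mu .$$

Since the trace is preserved at all times, checking positivity preservation amounts to
checking whether $\mathrm{Det}[\rho(t)]\ge0$ for all $t\ge0$ and for all initial $\rho$.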

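Since the contributions of the anti-symmetric $\mathcal{H}$ cancel out, the time-derivative
of the determinant reads

$$D[\rho] := \frac{d\,\mathrm{Det}[\rho(t)]}{dt}\Big|_{t=0} = 2\Big(\sum_{i,j=1}^{3}\mathcal{D}_{ij}\,\rho_i\rho_j + \sum_{j=1}^{3}\mathcal{D}_{j0}\,\rho_j\Big) .$$

Let $\rho$ be a pure state $P(n) := \frac{\mathbb{1}+n\cdot\sigma}{2}$, then $\mathrm{Det}[P(n)] = 0$. Therefore, $\gamma_t[P(n)]\ge0$
asks for $D[P(n)]\ge0$ and the same must also be true for the orthogonal projector
$P(-n)$. By summing $D[P(n)]\ge0$ and $D[P(-n)]\ge0$ and varying n in the unit
sphere, positivity is preserved only if

$$\mathcal{D}^{(3)} = \begin{pmatrix} a & b & c\\ b & \alpha & \beta\\ c & \beta & \gamma \end{pmatrix} \ge 0 . \tag{5.212}$$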
chy 


The positivity of $\mathcal{D}^{(3)}$ is necessary for positivity preservation, but not sufficient, the
reason being that $D[P(n)]<0$ can follow because of the extra term $\sum_{j=1}^{3}\mathcal{D}_{j0}\,\rho_j$.
However, it becomes also sufficient when we ask that $\gamma_t$ increase the von Neumann
entropy of any initial state, as this is equivalent to $u=v=w=0$ in $\mathcal{D}$. Indeed,
given any initial $\rho$, let it be spectralized as

$$\rho = r_1\,\frac{\mathbb{1}+n\cdot\sigma}{2} + r_2\,\frac{\mathbb{1}-n\cdot\sigma}{2} ,$$

with $0\le r_{1,2}\le1$, $r_1+r_2=1$, $n\in\mathbb{R}^3$ and $\sum_{j=1}^{3}n_j^2 = 1$. Then, one explicitly computes

$$\dot S(\rho) := \frac{dS(\rho(t))}{dt}\Big|_{t=0} = -\mathrm{Tr}\Big(\frac{d\rho(t)}{dt}\Big|_{t=0}\ln\rho\Big) = 2\Big[(r_1-r_2)\,\langle n|\mathcal{D}^{(3)}|n\rangle + \sum_{j=1}^{3}\mathcal{D}_{j0}\,n_j\Big]\ln\frac{r_1}{r_2} .$$

If $\mathcal{D}_{j0} = 0$ for $j=1,2,3$, then the time-derivative is positive because of the
positivity of $\mathcal{D}^{(3)}$ and due to the fact that $(r_1-r_2)\ln(r_1/r_2)\ge0$; if not, one can always find
a $\rho$ for which $\dot S(\rho)<0$: it suffices to choose $r_1-r_2$ sufficiently small and adjust n
to make negative the second term in the previous expression.


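Example 5.6.6 Let us consider the following simple master equation for a 2-level
system S,

$$\frac{\partial\rho(t)}{\partial t} = \mathbb{L}[\rho(t)] = \frac12\big(\sigma_1\rho(t)\sigma_1 - \sigma_2\rho(t)\sigma_2 + \sigma_3\rho(t)\sigma_3 - \rho(t)\big) .$$

Using (5.58), the Pauli matrices behave as eigen-vectors of the generator: $\mathbb{L}[\mathbb{1}] =
\mathbb{L}[\sigma_1] = \mathbb{L}[\sigma_3] = 0$, while $\mathbb{L}[\sigma_2] = -2\sigma_2$. Therefore, by simple integration of
$\partial_t\gamma_t[\sigma_j] = \mathbb{L}[\gamma_t[\sigma_j]] = \gamma_t[\mathbb{L}[\sigma_j]]$, they are also eigen-vectors of the generated
semi-group $\gamma_t = \exp(t\mathbb{L})$:

$$\gamma_t[\mathbb{1}] = \mathbb{1} , \qquad \gamma_t[\sigma_1] = \sigma_1 , \qquad \gamma_t[\sigma_2] = e^{-2t}\sigma_2 , \qquad \gamma_t[\sigma_3] = \sigma_3 .$$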


Hence, using the Bloch representation in Example 5.5.1.1, initial density matrices
evolve in time according to

$$\gamma_t[\rho] = \frac12\big(\mathbb{1} + \rho_1\sigma_1 + e^{-2t}\rho_2\sigma_2 + \rho_3\sigma_3\big) .$$

Since the matrix in (5.212) is now $\mathcal{D}^{(3)} = \begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix}$, the necessary condition for
positivity preservation is satisfied. Further, the Bloch vector at time t, $\rho(t)$, is such
that $\|\rho(t)\|\le\|\rho\|$; therefore, any initial density matrix remains a density matrix.
However, suppose the 2-level system S evolving under $\gamma_t$ is statistically coupled to
another 2-level system S' that has no evolution of its own. Then, one has to consider
states of the composite system S' + S that evolve in time under the semi-group of
maps of the form $\Gamma_t = \mathrm{id}_2\otimes\gamma_t$, that is $\Gamma_t$ lifts the action of $\gamma_t$ from $M_2(\mathbb{C})$ to
$M_2(M_2(\mathbb{C})) = M_2(\mathbb{C})\otimes M_2(\mathbb{C})$ as in Sect. 5.2.2.
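Among the possible initial conditions for $\Gamma_t$ there is the Bell state $|\Psi_0\rangle$ in (5.174);
we know from (5.17) that the corresponding projector $P_+^{(2)}$ is proportional to the
matrix $E = [E_{ij}]\in M_2(M_2(\mathbb{C}))$ whose entries are matrix units in $M_2(\mathbb{C})$. By writing
$E_{11} = (\mathbb{1}+\sigma_3)/2$, $E_{12} = (\sigma_1+i\sigma_2)/2$ and $E_{22} = (\mathbb{1}-\sigma_3)/2$ in terms of the Pauli
matrices, it turns out that

$$P_+^{(2)} = \frac14\big(\mathbb{1}\otimes\mathbb{1} + \sigma_1\otimes\sigma_1 - \sigma_2\otimes\sigma_2 + \sigma_3\otimes\sigma_3\big) .$$

Then, setting $\lambda_t := \exp(-2t)$, under the time-evolution $\Gamma_t$, $P_+^{(2)}$ evolves into

$$\Gamma_t[P_+^{(2)}] = \frac14\big(\mathbb{1}\otimes\mathbb{1} + \sigma_1\otimes\sigma_1 - \lambda_t\,\sigma_2\otimes\sigma_2 + \sigma_3\otimes\sigma_3\big)$$
$$= \frac14\begin{pmatrix} \mathbb{1}+\sigma_3 & \sigma_1+i\lambda_t\sigma_2\\ \sigma_1-i\lambda_t\sigma_2 & \mathbb{1}-\sigma_3 \end{pmatrix} = \frac14\begin{pmatrix} 2 & 0 & 0 & 1+\lambda_t\\ 0 & 0 & 1-\lambda_t & 0\\ 0 & 1-\lambda_t & 0 & 0\\ 1+\lambda_t & 0 & 0 & 2 \end{pmatrix} ,$$

which is not positive definite for any $t>0$, for it always shows a negative eigenvalue
$(\lambda_t-1)/4$.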
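
The content of the example can be reproduced numerically. In the sketch below (function names and the chosen time are illustrative assumptions) the map $\gamma_t$ is implemented through its action on the Pauli basis, $\mathrm{id}_2\otimes\gamma_t$ is applied block by block to the projector onto $(|00\rangle+|11\rangle)/\sqrt2$, and the smallest eigenvalue is compared with $(\lambda_t-1)/4$.

```python
import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def gamma(X, t):
    """gamma_t leaves 1, sigma_1, sigma_3 invariant and damps sigma_2 by exp(-2t)."""
    damping = [1.0, 1.0, np.exp(-2 * t), 1.0]
    return sum(0.5 * c * np.trace(s @ X) * s for c, s in zip(damping, PAULIS))

def id_tensor_gamma(M, t):
    """Apply id_2 (x) gamma_t to a 4x4 matrix, block by block."""
    blocks = M.reshape(2, 2, 2, 2)          # blocks[i, :, j, :] is the (i, j) block
    out = np.zeros_like(blocks)
    for i in range(2):
        for j in range(2):
            out[i, :, j, :] = gamma(blocks[i, :, j, :], t)
    return out.reshape(4, 4)

t = 0.5
lam = np.exp(-2 * t)

# Single system: a pure state (here an eigenstate of sigma_2) stays positive.
ket = np.array([1, 1j]) / np.sqrt(2)
single_min = np.linalg.eigvalsh(gamma(np.outer(ket, ket.conj()), t)).min()

# Two systems: the Bell projector acquires the negative eigenvalue (lam - 1)/4.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
evolved = id_tensor_gamma(np.outer(bell, bell.conj()), t)
bell_min = np.linalg.eigvalsh(evolved).min()
print(single_min, bell_min, (lam - 1) / 4)
```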


The physical meaning of the previous example is that, though $\gamma_t$ is a meaningful
time-evolution for one 2-level system, $\Gamma_t$ is not so for two 2-level systems as there
exists a state of the two together which does not remain positive definite in the course
of time. Notice that the state which exposes the problem is entangled; indeed, any
separable state, as in Definition 5.5.4, would remain positive under $\Gamma_t$: as $\gamma_t[\rho]\ge0$
for all $\rho\in\mathcal{S}(S)$,

$$\Gamma_t\Big[\sum_{i,j}\lambda_{ij}\,\rho_i\otimes\rho_j\Big] = \sum_{i,j}\lambda_{ij}\,\rho_i\otimes\gamma_t[\rho_j] \ge 0 .$$

The importance of Theorem 5.2.1 is now apparent: physical transformations of an
N-level system S cannot be described by linear maps $\Lambda$ that are only positivity
preserving, they must also be completely positive. Otherwise, by coupling S with
another N-level system S', one would obtain a map $\mathrm{id}_N\otimes\Lambda$ which would map the
initial entangled state $P_+^{(N)}$ into a non-positive definite matrix.

The standard quantum time-evolution $\mathbb{U}_t$ is automatically in Kraus form, thus
completely positive and free from inconsistencies with respect to statistical couplings
to ancillas. It is only when performing a Markovian approximation that one must
check that complete positivity be guaranteed by the procedure [116,117,127,337].


5.6.4 Quantum Dynamical Semigroups 


Positivity and complete positivity depend on the dissipative term $\mathbb{D}[\rho]$ added to the
commutator in (5.205): it turns out that when one asks that the generated semi-group
consist of completely positive maps, then the form of the generator is completely
fixed by two theorems, one by V. Gorini, A. Kossakowski and E. C. G. Sudarshan [154] and
one by G. Lindblad [229], that established its canonical GKSL form.

Theorem 5.6.1 (GKSL-Theorem) Let $\{\gamma_t^+\}_{t\ge0}$ be a one-parameter semi-group
of hermiticity preserving, unital linear maps $\gamma_t^+ : M_d(\mathbb{C})\to M_d(\mathbb{C})$ such that
$\lim_{t\to0}\gamma_t^+ = \mathrm{id}_d$ with respect to the norm-topology. Then,

1. the semi-group has the form $\gamma_t^+ = \exp(t\mathbb{L}^+)$ with generator

$$\mathbb{L}^+[X] = i[H,X] + \sum_{a,b=1}^{d^2-1}C_{ab}\Big(F_a^\dagger XF_b - \frac12\big\{F_a^\dagger F_b\,,\,X\big\}\Big) , \tag{5.213}$$

where the matrices $F_a$ form an ONB in $M_d(\mathbb{C})$ with respect to the Hilbert-Schmidt
scalar product with $F_{d^2} = \mathbb{1}_d/\sqrt d$ ($\mathrm{Tr}(F_a) = 0$ if $a\le d^2-1$) and the $(d^2-1)\times(d^2-1)$
matrix $C := [C_{ab}]$, called Kossakowski matrix, is Hermitean.

2. The maps $\gamma_t^+$ are completely positive if and only if $[C_{ab}]$ is a positive matrix.


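Proof From Example 5.2.7.1, the linear maps $\mathbb{F}_{ab} : M_d(\mathbb{C})\to M_d(\mathbb{C})$ defined by
$\mathbb{F}_{ab}[X] := F_a^\dagger XF_b$, $a,b = 1,2,\ldots,d^2$, form an orthonormal basis in the $d^4$-dimensional
linear space of all linear operators on $M_d(\mathbb{C})$ equipped with the Hilbert-Schmidt
scalar product of the associated Choi matrices. It follows that the generator
$\mathbb{L}^+$ can be expanded as $\mathbb{L}^+ = \sum_{a,b=1}^{d^2}L_{ab}\,\mathbb{F}_{ab}$. Then, the request that the
generated semi-group preserve hermiticity implies that $(\mathbb{L}^+[X])^\dagger = \mathbb{L}^+[X^\dagger]$ for all
$X\in M_d(\mathbb{C})$ which in turn yields $L_{ab}^* = L_{ba}$. Now, after rewriting

$$\mathbb{L}^+[X] = FX + XF^\dagger + \sum_{a,b=1}^{d^2-1}L_{ab}\,F_a^\dagger XF_b , \qquad F := \frac{L_{d^2d^2}}{2d}\,\mathbb{1}_d + \frac{1}{\sqrt d}\sum_{a=1}^{d^2-1}L_{ad^2}\,F_a^\dagger ,$$

and separating F into its Hermitean components, $F = K + iH$, where

$$K := \frac{L_{d^2d^2}}{2d}\,\mathbb{1}_d + \frac{1}{2\sqrt d}\sum_{a=1}^{d^2-1}\big(L_{ad^2}F_a^\dagger + L_{ad^2}^*F_a\big) , \qquad H := \frac{1}{2i\sqrt d}\sum_{a=1}^{d^2-1}\big(L_{ad^2}F_a^\dagger - L_{ad^2}^*F_a\big) ,$$

one concludes that

$$\mathbb{L}^+[X] = i[H,X] + (KX+XK) + \sum_{a,b=1}^{d^2-1}L_{ab}\,F_a^\dagger XF_b .$$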


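The first statement of the theorem follows by further imposing unitality, that is
that $\gamma_t^+[\mathbb{1}] = \mathbb{1}$ for all $t\ge0$. One thus gets that $\mathbb{L}^+[\mathbb{1}] = 0$, which further imposes

$$K = -\frac12\sum_{a,b=1}^{d^2-1}L_{ab}\,F_a^\dagger F_b ,$$

so that (5.213) follows with $C_{ab} := L_{ab}$.

According to Theorem 5.2.1, $\gamma_t^+$ is a CPU map on $M_d(\mathbb{C})$ if and only if $\Gamma_t^+ :=
\mathrm{id}_d\otimes\gamma_t^+$ is a positive, unital map on $M_d(\mathbb{C})\otimes M_d(\mathbb{C})$. Notice that the maps $\Gamma_t^+$ form
a norm-continuous semi-group with generator $\mathbb{L}_\Gamma^+ := \mathrm{id}_d\otimes\mathbb{L}^+$; then, according
to [211,212] (see also [80]), the maps $\mathrm{id}_d\otimes\gamma_t^+$ are positive if and only if

$$I(\psi,\phi) := \langle\psi|\,\mathbb{L}_\Gamma^+[|\phi\rangle\langle\phi|]\,|\psi\rangle \ge 0$$

for all orthogonal $\psi,\phi\in\mathbb{C}^d\otimes\mathbb{C}^d$. Since $\langle\psi|\phi\rangle = 0$, it follows that

$$I(\psi,\phi) = \sum_{a,b=1}^{d^2-1}C_{ab}\,\langle\psi|\,\mathbb{1}_d\otimes F_a^\dagger\,|\phi\rangle\langle\phi|\,\mathbb{1}_d\otimes F_b\,|\psi\rangle .$$

Then, it proves convenient to define the $d\times d$ matrices $\Psi = [\psi_{ij}]$ and $\Phi = [\phi_{ij}]$
where $\psi_{ij}$ and $\phi_{ij}$ are the components of the vectors $\psi$ and $\phi$ with respect to a fixed
ONB $\{|i,j\rangle\}_{i,j=1}^{d}$ in $\mathbb{C}^d\otimes\mathbb{C}^d$. Notice that $\langle\psi|\phi\rangle = \mathrm{Tr}(\Psi^\dagger\Phi)$. By introducing the
vectors $|v\rangle\in\mathbb{C}^{d^2-1}$ with components given by

$$v_a = \langle\phi|\,\mathbb{1}_d\otimes F_a\,|\psi\rangle = \mathrm{Tr}\big(F_a\,(\Phi^\dagger\Psi)^T\big)$$

one then rewrites $I(\psi,\phi) = \sum_{a,b=1}^{d^2-1}C_{ab}\,v_a^*v_b = \langle v|C|v\rangle$. If $C = [C_{ab}]\ge0$, then
$I(\psi,\phi)\ge0$ for all orthogonal $\psi,\phi\in\mathbb{C}^d\otimes\mathbb{C}^d$, whence $\Gamma_t^+$ is positive.
Vice versa, given a generic vector $|v\rangle\in\mathbb{C}^{d^2-1}$, the traceless matrix $M_d(\mathbb{C})\ni
\Psi := \sum_{a=1}^{d^2-1}v_a\,(F_a^\dagger)^T$ corresponds to a vector $\psi\in\mathbb{C}^d\otimes\mathbb{C}^d$ that is orthogonal to the
non-normalized totally symmetric vector $|\Psi_+\rangle = \sum_{i=1}^{d}|ii\rangle$, whose matrix is $\Phi = \mathbb{1}_d$,
and such that $\mathrm{Tr}(F_a\Psi^T) = v_a$. If $\Gamma_t^+$ is positive, then
$I(\psi,\Psi_+) = \langle v|C|v\rangle\ge0$ for all $|v\rangle\in\mathbb{C}^{d^2-1}$, whence $C\ge0$.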


Remarks 5.6.4 


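1. If $\{\gamma_t^+\}_{t\ge0}$ is a norm-continuous one-parameter semi-group of positive maps $\gamma_t^+ :
M_d(\mathbb{C})\to M_d(\mathbb{C})$ with generator $\mathbb{L}^+$ and $\psi,\phi\in\mathbb{C}^d$ are orthogonal vectors, then

$$0\le\langle\psi|\,\gamma_t^+[|\phi\rangle\langle\phi|]\,|\psi\rangle \simeq t\,\langle\psi|\,\mathbb{L}^+[|\phi\rangle\langle\phi|]\,|\psi\rangle$$

to first order in t. This yields the only if part of the theorem used in the previous
proof; it turns out that this condition is also sufficient for the maps $\gamma_t^+$ to be
positive [80,211,212].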

2. The extension of Theorem 5.6.1 from $M_d(\mathbb{C})$ to $B(\mathbb{H})$ with $\mathbb{H}$ an infinite
dimensional Hilbert space has been provided by [229] under the assumption that
$\|\mathbb{L}^+[X]\|\le\|\mathbb{L}^+\|\,\|X\|$ for all $X\in B(\mathbb{H})$, namely that the generator $\mathbb{L}^+$ be
bounded on $B(\mathbb{H})$.

3. By duality, one gets the following time-evolution equation for the states of the
open quantum system S:

$$\mathbb{L}[\rho] = -i[H,\rho] + \sum_{a,b=1}^{d^2-1}C_{ab}\Big(F_b\,\rho\,F_a^\dagger - \frac12\big\{F_a^\dagger F_b\,,\,\rho\big\}\Big) . \tag{5.214}$$

Since the $\gamma_t^+$ are unital, their dual maps $\gamma_t$ preserve the trace of $\rho$.

4. If $\gamma_t$ is CP, the expression $\sum_{a,b=1}^{d^2-1}C_{ab}\,F_b\,\rho\,F_a^\dagger$ can be put in Kraus form as in (5.38).
Such a term corresponds to what in the classical Brownian motion is the diffusive
effect due to the presence of a white-noise. It is indeed sometimes called quantum
noise which is also in agreement with the effects of generic POVMs on quantum
states [145].

5. Beside the noise contribution, the remaining part of the generator has the form

$$-i\Big(H-\frac{i}{2}K\Big)\rho + i\,\rho\Big(H+\frac{i}{2}K\Big) , \qquad K := \sum_{a,b=1}^{d^2-1}C_{ab}\,F_a^\dagger F_b .$$

This expression corresponds to the typical phenomenological description of the
time-evolution of decaying systems; in particular, K is a damping term due to
probability that goes irreversibly from the system S to its decay products.

6. Regarding the positivity of the generated maps $\gamma_t = \exp(t\mathbb{L})$, though necessary
and sufficient conditions are given by the positivity of the mean values
$\langle\psi|\,\mathbb{L}[|\phi\rangle\langle\phi|]\,|\psi\rangle$ for all orthogonal state vectors $|\psi\rangle,|\phi\rangle\in\mathbb{C}^d$, apart from
particular cases [212], no general results on the form of the Kossakowski matrix
$C = [C_{ij}]$ can be extracted from them. For a similar problem, albeit in the more
general context of non-Markovianity, see Sect. 5.6.6, Proposition 5.6.3.


Semigroups consisting of completely positive maps are called quantum dynam- 
ical semi-groups. Their derivation as Markovian approximations of an underlying 
reversible many-body dynamics mainly follows three schemes, the already men- 
tioned weak-coupling limit, the singular-coupling limit [153,154,273] and the low 
density limit [126]. All of them work when the time scales of the system S and of 
the reservoir R are clearly distinguishable. The weak-coupling limit is the one most 
frequently encountered in the literature since the beginning of the theory of open 
quantum systems and also the one which, if not performed with due accuracy [117], 
leads to semi-groups of maps which are not completely positive and thus to physical
inconsistencies in relation to entanglement.

Example 5.6.7 ([37]) Let d = 2; in such a case, by choosing the orthonormal
basis of Pauli matrices $F_j = \sigma_j/\sqrt2$, $j = 1,2,3$, the dissipative contribution to
the semi-group generator reads

$$\mathbb{L}_D[\rho] = \sum_{i,j=1}^{3}C_{ij}\Big[\sigma_i\,\rho\,\sigma_j - \frac12\big\{\sigma_j\sigma_i\,,\,\rho\big\}\Big] .$$

For sake of simplicity, we shall restrict to entropy-increasing semi-groups. Then, we
can consider the matrix $\mathcal{D}$ in (5.211) whose entries read

$$a = C_{22}+C_{33} , \qquad \alpha = C_{11}+C_{33} , \qquad \gamma = C_{11}+C_{22} ,$$
$$b = -C_{12} , \qquad c = -C_{13} , \qquad \beta = -C_{23} .$$

Thus, the positivity of $[C_{ij}]$, which, according to the previous theorem, is necessary
and sufficient for the complete positivity of $\gamma_t$, results in the necessary and sufficient
inequalities for $a, b, c, \alpha, \beta, \gamma$:

$$2R := \alpha+\gamma-a\ge0 , \qquad RS\ge b^2 ,$$
$$2S := a+\gamma-\alpha\ge0 , \qquad RT\ge c^2 ,$$
$$2T := a+\alpha-\gamma\ge0 , \qquad ST\ge\beta^2 ,$$
$$RST\ge 2bc\beta + R\beta^2 + Sc^2 + Tb^2 .$$


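These constraints are much stronger than those coming from positivity alone, that is
from $\mathcal{D}^{(3)}\ge0$ in (5.212) which yields

$$a\ge0 , \quad \alpha\ge0 , \quad \gamma\ge0 , \quad a\alpha\ge b^2 , \quad a\gamma\ge c^2 , \quad \alpha\gamma\ge\beta^2$$

and $\mathrm{Det}\,\mathcal{D}^{(3)}\ge0$. As a concrete example, take $a = \alpha$ and $\beta = b = c = 0$, so that
the GKSL-generator reads

$$\mathbb{L}[\rho] = \frac{\gamma}{4}\big(\sigma_1\rho\sigma_1-\rho\big) + \frac{\gamma}{4}\big(\sigma_2\rho\sigma_2-\rho\big) + \frac{2\alpha-\gamma}{4}\big(\sigma_3\rho\sigma_3-\rho\big) ,$$

whence $\mathbb{L}[\sigma_{1,2}] = -\alpha\,\sigma_{1,2}$ and $\mathbb{L}[\sigma_3] = -\gamma\,\sigma_3$. It follows that, when $\alpha,\gamma>0$, the
generated semi-group $\gamma_t$ describes a decay process towards $\rho_\infty = \mathbb{1}/2$ with different
rates for the diagonal and off-diagonal elements of $\gamma_t[\rho]$.

Indeed, setting $\mu_t := \exp(-\gamma t)$ and $\lambda_t := \exp(-\alpha t)$, it turns out that

$$\gamma_t[\rho] = \frac12\Big(\mathbb{1} + \lambda_t(\rho_1\sigma_1+\rho_2\sigma_2) + \mu_t\,\rho_3\sigma_3\Big) = \frac12\begin{pmatrix} 1+\mu_t\rho_3 & \lambda_t(\rho_1-i\rho_2)\\ \lambda_t(\rho_1+i\rho_2) & 1-\mu_t\rho_3 \end{pmatrix} .$$

On the other hand, consider $\Gamma_t = \mathrm{id}_2\otimes\gamma_t$ and $P_+^{(2)}$ as in Example 5.6.6; it turns out
that

$$\Gamma_t[P_+^{(2)}] = \frac14\Big(\mathbb{1}\otimes\mathbb{1} + \lambda_t(\sigma_1\otimes\sigma_1-\sigma_2\otimes\sigma_2) + \mu_t\,\sigma_3\otimes\sigma_3\Big) = \frac14\begin{pmatrix} 1+\mu_t & 0 & 0 & 2\lambda_t\\ 0 & 1-\mu_t & 0 & 0\\ 0 & 0 & 1-\mu_t & 0\\ 2\lambda_t & 0 & 0 & 1+\mu_t \end{pmatrix} .$$

This matrix is positive if and only if $1+\mu_t\ge2\lambda_t$. This is implied by the
complete positivity condition $\gamma\le2\alpha$, whereas if $\gamma>2\alpha$, when $t\to0$ one gets

$$1+\mu_t-2\lambda_t \simeq t\,(2\alpha-\gamma) < 0 .$$

In conclusion, only complete positivity guarantees full physical consistency with
respect to statistical couplings with other systems. However, this imposes a hierarchy,
$\gamma\le2\alpha$, upon the decay rates of the entries of the dissipatively evolving state $\gamma_t[\rho]$,
which should otherwise only be positive.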
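
The link between the positivity of the Kossakowski matrix and complete positivity can also be probed numerically. The sketch below is an illustration under assumptions of its own: it uses the Schrödinger-picture form (5.214) with $H = 0$, the basis $F_j = \sigma_j/\sqrt2$, and a diagonal Kossakowski matrix `diag(gamma/2, gamma/2, (2*alpha - gamma)/2)` chosen so that $\mathbb{L}[\sigma_{1,2}] = -\alpha\sigma_{1,2}$ and $\mathbb{L}[\sigma_3] = -\gamma\sigma_3$ as stated in the example; the helper names and the truncated Taylor exponential are hypothetical conveniences. It exponentiates the generator on row-major vectorized matrices and inspects the smallest eigenvalue of the (unnormalized) Choi matrix $\sum_{ij}E_{ij}\otimes\gamma_t[E_{ij}]$.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
F = [s / np.sqrt(2) for s in SIGMA]       # Hilbert-Schmidt orthonormal, traceless
ID = np.eye(2, dtype=complex)

def gksl_superoperator(C):
    """Matrix of rho -> sum_ab C_ab (F_b rho F_a^+ - {F_a^+ F_b, rho}/2) on row-major vec(rho)."""
    L = np.zeros((4, 4), dtype=complex)
    for a in range(3):
        for b in range(3):
            K = F[a].conj().T @ F[b]
            L += C[a, b] * (np.kron(F[b], F[a].conj())
                            - 0.5 * np.kron(K, ID) - 0.5 * np.kron(ID, K.T))
    return L

def expm_taylor(M, terms=40):
    """Truncated Taylor series of exp(M); adequate for the small norms used here."""
    out = np.eye(M.shape[0], dtype=complex)
    term = out.copy()
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def choi_min_eigenvalue(C, t):
    G = expm_taylor(t * gksl_superoperator(C))
    J = np.zeros((4, 4), dtype=complex)
    for i in range(2):
        for j in range(2):
            E = np.zeros((2, 2), dtype=complex)
            E[i, j] = 1
            J += np.kron(E, (G @ E.reshape(-1)).reshape(2, 2))
    return np.linalg.eigvalsh(J).min()

def kossakowski(alpha, gamma):
    return np.diag([gamma / 2, gamma / 2, (2 * alpha - gamma) / 2]).astype(complex)

t = 0.3
cp_case = choi_min_eigenvalue(kossakowski(1.0, 1.0), t)      # gamma <= 2 alpha
non_cp_case = choi_min_eigenvalue(kossakowski(1.0, 3.0), t)  # gamma > 2 alpha
print(cp_case, non_cp_case)
```

With these conventions the Choi matrix equals $2\,\Gamma_t[P_+^{(2)}]$, so that in the second case its smallest eigenvalue is $(1+\mu_t-2\lambda_t)/2<0$, in agreement with the inequality discussed above.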


5.6.5 Physical Operations and Positive Maps 


The argument at the roots of the request of complete positivity on state transforma- 
tions is that one can never exclude that the system S undergoing the transformation is 
indeed entangled with an ancilla system S’, even if the corresponding statistical cor- 
relations are neither detectable nor controllable. Though plausible, this point of view 
is not always accepted [278]; after all, the mere possibility of entanglement with an 
uncontrollable ancilla would then, via complete positivity, constrain the decay
properties. Consider, for instance, an actual experiment where optically active molecules
interact weakly with a heat bath; they can effectively be described as 2-level systems. 
The relaxation to equilibrium of their optical activity can accordingly be predicted 
by an appropriate master equation. Clearly, the fact that the optical activity may 
depend on whether the molecules are entangled with some other system out of any 
experimental control sounds admittedly weird [91,220,221,344]. 

However, most of the objections to complete positivity do not consider the
entanglement issue for they all focus upon single open quantum systems in heat baths. If,
however, two optically active molecules in the same environment are considered, the
entanglement issue comes to the fore. If the two molecules do not interact, but are
weakly and independently coupled to the same environment, it is sensible to describe
their open dynamics by a semi-group of dynamical maps of the form $\Gamma_t = \gamma_t\otimes\gamma_t$,
where $\gamma_t$ is the reduced dynamics of a single molecule. These dynamical maps differ
from $\mathrm{id}_2\otimes\gamma_t$ in Examples 5.6.6 and 5.6.7.

Notice that in going from $\mathrm{id}_d\otimes\gamma_t$ to $\Gamma_t = \gamma_t\otimes\gamma_t$ one effectively goes from
the possible existence of statistical correlations between the system $S_d$ and another
system of the same type which is somewhat uncontrollable, to a concrete scenario
where one has two statistically coupled systems in the same environment. The following
result on the one hand extends Theorem 5.6.1 and on the other stresses once more the
fact that complete positivity is not just a mathematical option without strong physical
motivations, but rather an unavoidable constraint to be satisfied by all fully physically
consistent Markovian approximations.


Proposition 5.6.1 ([51]) Let $\{\gamma_t^+\}_{t\ge0}$ be a norm-continuous semi-group of dynamical
maps $\gamma_t^+ : M_d(\mathbb{C})\to M_d(\mathbb{C})$ with generator as in Theorem 5.6.1. Then, the linear
maps $\Gamma_t^+ = \gamma_t^+\otimes\gamma_t^+$ form a norm-continuous semi-group on $M_d(\mathbb{C})\otimes M_d(\mathbb{C})$
and preserve positivity if and only if $\gamma_t^+$ is a CPU map for all $t\ge0$.


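Proof One implication is straightforward: if $\gamma_t^+$ is a CPU map, then $\gamma_t^+\otimes\mathrm{id}_d$ and
$\mathrm{id}_d\otimes\gamma_t^+$ are positive and so is their composition $\Gamma_t^+$.

For the other implication, notice that, in view of the assumptions, the one-parameter
family $\{\Gamma_t^+\}_{t\ge0}$ is a norm-continuous semi-group with generator $\mathbb{L}_\Gamma^+ =
\mathbb{L}^+\otimes\mathrm{id}_d + \mathrm{id}_d\otimes\mathbb{L}^+$. Then, arguing as in the proof of the second part of
Theorem 5.6.1, the positivity of $\Gamma_t^+$ requires that

$$I(\psi,\phi) := \langle\psi|\,\mathbb{L}_\Gamma^+[|\phi\rangle\langle\phi|]\,|\psi\rangle \ge 0$$

for all orthogonal $\psi,\phi\in\mathbb{C}^d\otimes\mathbb{C}^d$. Since $\langle\psi|\phi\rangle = 0$, it follows that

$$I(\psi,\phi) = \sum_{a,b=1}^{d^2-1}C_{ab}\Big(\langle\psi|\,F_a^\dagger\otimes\mathbb{1}_d\,|\phi\rangle\langle\phi|\,F_b\otimes\mathbb{1}_d\,|\psi\rangle + \langle\psi|\,\mathbb{1}_d\otimes F_a^\dagger\,|\phi\rangle\langle\phi|\,\mathbb{1}_d\otimes F_b\,|\psi\rangle\Big) .$$

By means of the matrices $\Psi = [\psi_{ij}]$ and $\Phi = [\phi_{ij}]$ associated to the vectors $\psi,\phi\in
\mathbb{C}^d\otimes\mathbb{C}^d$ as explained in the proof of Theorem 5.6.1, one introduces the vectors
$|w\rangle,|v\rangle\in\mathbb{C}^{d^2-1}$ with components given by

$$w_b := \langle\phi|\,F_b\otimes\mathbb{1}_d\,|\psi\rangle = \mathrm{Tr}\big(F_b\,\Psi\Phi^\dagger\big) , \qquad v_b := \langle\phi|\,\mathbb{1}_d\otimes F_b\,|\psi\rangle = \mathrm{Tr}\big(F_b\,(\Phi^\dagger\Psi)^T\big) ,$$

and rewrites

$$I(\psi,\phi) = \sum_{a,b=1}^{d^2-1}C_{ab}\big(w_a^*w_b + v_a^*v_b\big) = \langle w|C|w\rangle + \langle v|C|v\rangle . \qquad (*)$$

Given $|w\rangle\in\mathbb{C}^{d^2-1}$, construct $M_d(\mathbb{C})\ni W := \sum_{a=1}^{d^2-1}w_a\,F_a^\dagger$. Since a matrix and
its transposed are always similar [162], let $Y\in M_d(\mathbb{C})$ be such that $W^T = YWY^{-1}$
and define $\Psi := Y^{-1}$, $\Phi^\dagger := YW$ so that

$$\Psi\Phi^\dagger = W , \qquad (\Phi^\dagger\Psi)^T = (YWY^{-1})^T = W \qquad\text{and}\qquad |w\rangle = |v\rangle ,$$

whence $(*)$ becomes $I(\psi,\phi) = 2\,\langle w|C|w\rangle\ge0$. Observe that $|w\rangle\in\mathbb{C}^{d^2-1}$ is
generic and that to any such vector one can associate orthogonal vectors $\psi,\phi\in
\mathbb{C}^d\otimes\mathbb{C}^d$ through the matrices $\Psi$, $\Phi$ as described above. Thus, $I(\psi,\phi)\ge0$ for any
such pair implies $C := [C_{ab}]\ge0$.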


Remarks 5.6.5 


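1. If positive, $\Gamma_t^+ = \gamma_t^+\otimes\gamma_t^+$ is also CP; indeed, using (5.38), it turns out that
$\gamma_t^+[X] = \sum_jV_j^\dagger(t)\,X\,V_j(t)$, $X\in M_d(\mathbb{C})$. As a consequence,

$$\Gamma_t^+[X] = \sum_{j,k}V_j^\dagger(t)\otimes V_k^\dagger(t)\;X\;V_j(t)\otimes V_k(t) .$$

2. The equivalence between the complete positivity of $\gamma_t^+$ and the positivity of $\Gamma_t^+ =
\gamma_t^+\otimes\gamma_t^+$ does not extend to the tensor products of generic $(\gamma_t^{(1,2)})^+$; indeed, in
Proposition 6.2.3 it will be shown that $\Gamma_t^+ = (\gamma_t^{(1)})^+\otimes(\gamma_t^{(2)})^+$ can be positive
without $(\gamma_t^{(1,2)})^+$ being both CPU maps.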


Once a semi-group reduced dynamics is accepted as a phenomenological time-evolution
under certain physical conditions such as those compatible with, for instance, the
weak-coupling limit scenario, there is only one possible way to get rid of the complete
positivity constraint. One has to rely upon the existence of physical mechanisms
that eliminate those initial entangled states that, like the symmetric projector $P_+^{(d)}$,
would otherwise be cast out of the state-space by $\Gamma_t$ when $\gamma_t$ is not completely
positive [149,151,344,377].

In quantum information the situation is physically clearer and complete positivity
compulsory. In fact, the state transformations that are commonly considered do not
come from dynamical semi-groups arising from suitable Markovian approximations; rather,
they are maps as in Definition 5.6.1. Indeed, the simplest operations are local state
transformations that two parties operate on shared entangled states such as $P_+^{(d)}$. In order
to be physically consistent, these local operations must correspond to completely
positive maps.

What then of positive maps? If, on one hand, the existence of entanglement in 
nature forbids their use as dynamical maps that describe actually occurring phys- 
ical processes, on the other hand, as we shall see in the following chapter, they 
automatically provide so-called entanglement witnesses. 


5.6.6 Non-Markovianity 


The semi-group composition law (5.204) is a paradigmatic example of Markovian 
behaviour, that is of loss of memory of the past dynamics in the course of time. 


6 Notice that, as (Y |) = 0, the matrices PY Ý and W1@ are traceless. 
The presence of memory effects is instead embodied in integro-differential master
equations of the form in (5.209),

$$\partial_t\rho_t = \int_0^t ds\, K_{t,s}[\rho_s] . \qquad (5.215)$$


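If the dynamical maps $\gamma_t : \rho \mapsto \rho_t := \gamma_t[\rho]$, $t \ge 0$, generated by (5.215)
are trace-preserving and algebraically invertible, namely if there exists $\gamma_t^{-1}$ for
all $t \ge 0$ such that $\gamma_t^{-1}\circ\gamma_t = \gamma_t\circ\gamma_t^{-1} = \mathrm{id}$, then, by writing
$\rho_s = \gamma_s\circ\gamma_t^{-1}[\rho_t]$, the previous master equation can be recast as

$$\partial_t\rho_t = L_t[\rho_t] , \qquad L_t := \int_0^t ds\, K_{t,s}\circ\gamma_s\circ\gamma_t^{-1} , \qquad (5.216)$$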


where $L_t$ is now an explicitly time-dependent generator acting as a linear operator on
the time-evolving density matrices $\rho_t$. This is what we shall assume in the following;
for the cases when one has to deal with an integro-differential equation without being
able to reduce it to a master equation, see [103]. Then, by means of an argument
similar to the one used in the first part of the proof of Theorem 5.6.1, we know that
$L_t$ has to be of the form (see (5.214))

$$L_t[\rho] = -i\,[H(t)\,,\,\rho] + \sum_{a,b=1}^{d^2-1} C_{ab}(t)\Big(F_b\,\rho\,F_a^\dagger - \frac{1}{2}\big\{F_a^\dagger F_b\,,\,\rho\big\}\Big) , \qquad (5.217)$$
where both the unitary and the dissipative contributions can now explicitly depend on
time, the latter through a time-dependent Kossakowski matrix $C_t = [C_{ab}(t)]$. Since
generators at different times do not in general commute, the formal integration over
$t_0 \le s \le t$ then requires a time-ordering:

$$\gamma_{t,t_0}[\rho] = \mathcal{T}\exp\Big(\int_{t_0}^{t} ds\, L_s\Big)[\rho] \qquad (5.218)$$

$$= \sum_{k=0}^{+\infty}\int_{t_0}^{t} ds_1\int_{t_0}^{s_1} ds_2\cdots\int_{t_0}^{s_{k-1}} ds_k\; L_{s_1}\circ L_{s_2}\circ\cdots\circ L_{s_k}[\rho] . \qquad (5.219)$$

The maps $\gamma_{t,s}$ compose as a two-parameter semigroup

$$\gamma_{t,s}\circ\gamma_{s,s_0} = \gamma_{t,s_0} \qquad \forall\; 0 \le s_0 \le s \le t . \qquad (5.220)$$


These maps are known as intertwiners as they interpolate between the dynamics up
to time $t$, $\gamma_t$, and that up to time $s$, $\gamma_s$:

$$\gamma_t = \gamma_{t,s}\circ\gamma_s \qquad \forall\; 0 \le s \le t . \qquad (5.221)$$

Notice that if $\gamma_s$ is invertible as a map then

$$\gamma_{t,s} = \gamma_t\circ\gamma_s^{-1} . \qquad (5.222)$$


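Example 5.6.8 A simple reduced dynamics forming a two-parameter semigroup is
generated by the following time-dependent generator:

$$L_t[\rho] = \frac{1}{2}\,(\sigma_1\rho\,\sigma_1 - \rho) + \frac{1}{2}\,(\sigma_2\rho\,\sigma_2 - \rho) - \frac{\tanh t}{2}\,(\sigma_3\rho\,\sigma_3 - \rho) , \qquad (5.223)$$

where $\tanh t > 0$ for $t > 0$, so that the diagonal Kossakowski matrix

$$C_t := \frac{1}{2}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -\tanh t \end{pmatrix} \qquad (5.224)$$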


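is positive semi-definite only at $t = 0$. As in the case of Examples 5.6.6 and 5.6.7,
the Pauli matrices are eigenvectors of the generators:

$$L_t[\mathbb{1}] = 0 , \qquad L_t[\sigma_{1,2}] = -(1 - \tanh t)\,\sigma_{1,2} , \qquad L_t[\sigma_3] = -2\,\sigma_3 . \qquad (5.225)$$

In such a case generators at different times commute; thus, the action of the dynamical
maps $\gamma_t$ follows by integrating $\partial_t\gamma_t[\sigma_j] = L_t[\gamma_t[\sigma_j]]$ from $0$ to $t$:

$$\gamma_t[\mathbb{1}] = \mathbb{1} , \qquad \gamma_t[\sigma_{1,2}] = e^{-t}\cosh t\;\sigma_{1,2} , \qquad \gamma_t[\sigma_3] = e^{-2t}\,\sigma_3 . \qquad (5.226)$$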


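Despite the Kossakowski matrix $C_t \ngeq 0$, the Choi matrix (see Remark 5.2.6) is
always positive semi-definite and thus the maps $\gamma_t$ are completely positive for all
$t \ge 0$. Indeed, setting $\Gamma_t = \mathrm{id}_2\otimes\gamma_t$ as in Example 5.6.7, one looks at the
spectral properties of the $4\times 4$ matrix

$$\Gamma_t[P^{(2)}_+] = \frac{1}{4}\Big(\mathbb{1}\otimes\mathbb{1} + \lambda_t\,(\sigma_1\otimes\sigma_1 - \sigma_2\otimes\sigma_2) + \mu_t\,\sigma_3\otimes\sigma_3\Big)
= \frac{1}{4}\begin{pmatrix} 1+\mu_t & 0 & 0 & 2\lambda_t \\ 0 & 1-\mu_t & 0 & 0 \\ 0 & 0 & 1-\mu_t & 0 \\ 2\lambda_t & 0 & 0 & 1+\mu_t \end{pmatrix} , \qquad (5.227)$$

$$\hbox{where}\quad \lambda_t = \frac{1+e^{-2t}}{2} , \qquad \mu_t = e^{-2t} . \qquad (5.228)$$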


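Since $t \ge 0$, the eigenvalues $1-\mu_t$ of the internal $2\times 2$ matrix are non-negative, while
of those of the external one, $e_\pm(t) := 1 + \mu_t \pm 2\lambda_t$, $e_+(t) > 0$ and $e_-(t) = 0$.
Furthermore, (5.226) yields the action of the inverses of the maps $\gamma_t$,

$$\gamma_t^{-1}[\mathbb{1}] = \mathbb{1} , \qquad \gamma_t^{-1}[\sigma_{1,2}] = \frac{2}{1+e^{-2t}}\,\sigma_{1,2} , \qquad \gamma_t^{-1}[\sigma_3] = e^{2t}\,\sigma_3 , \qquad (5.229)$$

and of the intertwiners $\gamma_{t,s} = \gamma_t\circ\gamma_s^{-1}$:

$$\gamma_{t,s}[\mathbb{1}] = \mathbb{1} , \qquad \gamma_{t,s}[\sigma_{1,2}] = \frac{1+e^{-2t}}{1+e^{-2s}}\,\sigma_{1,2} , \qquad \gamma_{t,s}[\sigma_3] = e^{-2(t-s)}\,\sigma_3 , \qquad (5.230)$$

exactly as by integrating $\partial_t\gamma_t[\sigma_j] = L_t\circ\gamma_t[\sigma_j]$ from $s$ to $t \ge s \ge 0$.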
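The spectral claims of this example can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample times are illustrative assumptions, and the eigenvalues are simply read off from the block structure of (5.227) and from the analogous matrix built with the intertwiner coefficients in (5.230).

```python
import math

def choi_eigenvalues(t):
    """Eigenvalues of 4 * (id (x) gamma_t)[P_+], read off from (5.227)."""
    lam = math.exp(-t) * math.cosh(t)      # lambda_t = (1 + e^{-2t}) / 2
    mu = math.exp(-2.0 * t)                # mu_t = e^{-2t}
    return [1.0 - mu, 1.0 - mu, 1.0 + mu + 2.0 * lam, 1.0 + mu - 2.0 * lam]

def kossakowski_diagonal(t):
    """Diagonal entries of the Kossakowski matrix C_t in (5.224)."""
    return [0.5, 0.5, -0.5 * math.tanh(t)]

def intertwiner_choi_eigenvalues(t, s):
    """Same spectrum, but for the intertwiner gamma_{t,s} of (5.230), t >= s."""
    a12 = (1.0 + math.exp(-2.0 * t)) / (1.0 + math.exp(-2.0 * s))
    a3 = math.exp(-2.0 * (t - s))
    return [1.0 - a3, 1.0 - a3, 1.0 + a3 + 2.0 * a12, 1.0 + a3 - 2.0 * a12]

TOL = 1e-12                                # tolerance for rounding errors
times = [0.0, 0.1, 0.5, 1.0, 3.0]          # illustrative sample times

# gamma_t is CP at every sampled time ...
maps_are_cp = all(min(choi_eigenvalues(t)) >= -TOL for t in times)
# ... although C_t has a negative entry for every t > 0 ...
kossakowski_not_positive = all(min(kossakowski_diagonal(t)) < 0 for t in times if t > 0)
# ... and the intertwiner gamma_{2,1} fails to be CP.
intertwiner_is_cp = min(intertwiner_choi_eigenvalues(2.0, 1.0)) >= -TOL
```

Under these assumptions the maps $\gamma_t$ pass the Choi test at all sampled times, while the intertwiner between $s = 1$ and $t = 2$ does not, which is the lack of CP-divisibility discussed in the text.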


The previous example shows that, in the non-Markovian case, the positivity of the
Kossakowski matrix is not necessary for the generated one-parameter family $\{\gamma_t\}_{t\ge 0}$
to consist of completely positive dynamical maps. Indeed, the proof of Theorem
5.6.1 extended to time-dependent generators yields the following result.


Theorem 5.6.2 ([103]) Let $\{\gamma_t\}_{t\ge 0}$ be a one-parameter family of hermiticity and
trace preserving linear maps on the space of states of a $d$-dimensional quantum
system, which is generated by the time-dependent generator in (5.217). Then, the
intertwiners $\gamma_{t,s}$ are CP if and only if the Kossakowski matrix $C_t = [C_{ab}(t)] \ge 0$.


Proof As in the proof of Theorem 5.6.1, the intertwiners $\gamma^+_{t,s}$ generated by $L_t^+$, dual to
$L_t$ in (5.217), hence dual to the intertwiners $\gamma_{t,s}$, are CPU maps on $M_d(\mathbb{C})$ for all
$t \ge s \ge 0$ if and only if $\Gamma^+_{t,s} := \mathrm{id}_d\otimes\gamma^+_{t,s}$ is a positive, unital map on
$M_d(\mathbb{C})\otimes M_d(\mathbb{C})$. By differentiation of $\Gamma^+_{t,s}$ at $t = s$, positivity holds if and only if

$$L(\psi,\phi) := \langle\psi|\,\mathrm{id}_d\otimes L_t^+\big[|\phi\rangle\langle\phi|\big]\,|\psi\rangle \ge 0$$

for all orthogonal $\psi, \phi \in \mathbb{C}^d\otimes\mathbb{C}^d$. Then, the argument developed for proving the
"only if" part of Theorem 5.6.1 yields $C_t \ge 0$, while the "if" part follows exactly as
in the proof of that theorem.


It has become customary to identify quantum non-Markovianity [103] with lack of CP
on the part of the intertwiners, or more mathematically with lack of CP-divisibility.


Remark 5.6.6 As much as for the positivity of the maps $\gamma_t$ in the case of dynamical
semigroups (see Remark 5.6.4.6), the positivity of the intertwiners $\gamma_{t,s}$ can be
inspected by looking at the positivity of the expectations $\langle\psi|L_t[|\phi\rangle\langle\phi|]|\psi\rangle$ for
all orthogonal $|\psi\rangle, |\phi\rangle \in \mathbb{C}^d$. As much as for semigroups, the problem is exactly
that of establishing necessary and sufficient conditions for positivity. Particular cases
where this is indeed possible are discussed in Proposition 5.6.3.


Though Markovianity is usually identified with the lack of memory effects embodied
in the one-parameter semi-group composition law (5.204), it has become customary
to extend it to two-parameter semigroups as in (5.220) according to the following
definition.

Definition 5.6.2 A one-parameter family of unital CP maps $\{\gamma_t^+\}_{t\ge 0}$ on $M_d(\mathbb{C})$ is
called Markovian when the intertwiners $\gamma^+_{t,s}$ are (unital) CP maps for all $t \ge s \ge 0$,
in which case the maps $\gamma_t^+$ are said to be CP-divisible. The maps $\gamma_t^+$ are called
P-divisible if the intertwiners $\gamma^+_{t,s}$ are positive maps for all $t \ge s \ge 0$.


In the following we shall deal with dynamical maps on the states of open quantum
systems which will be termed CP-divisible and P-divisible if such are their dual
maps on the corresponding algebras of observables.


Proposition 5.6.2 If a one-parameter family of unital CP maps $\{\gamma_t\}_{t\ge 0}$ on the space
of states of a $d$-level quantum system is P-divisible, then, with respect to the trace
norm (5.21),

$$\frac{d}{dt}\,\big\|\gamma_t[X]\big\|_1 \le 0 \qquad \forall\, X = X^\dagger \in M_d(\mathbb{C}) . \qquad (5.231)$$

Vice versa, if the maps $\gamma_t$ are invertible and (5.231) holds, then the maps $\gamma_t$ are
P-divisible.


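Proof From Proposition 5.2.1, if the maps $\gamma_{t,s}$ are positive they are contractive on
$X = X^\dagger \in M_d(\mathbb{C})$ with respect to the trace-norm; therefore, for all $\varepsilon \ge 0$,

$$\big\|\gamma_{t+\varepsilon}[X]\big\|_1 = \big\|\gamma_{t+\varepsilon,t}\circ\gamma_t[X]\big\|_1 \le \big\|\gamma_t[X]\big\|_1$$

and (5.231) follows. Vice versa, assume (5.231) holds; then, the action of the
intertwiners (5.221) yields

$$\big\|\gamma_t[X]\big\|_1 = \big\|\gamma_{t,s}\circ\gamma_s[X]\big\|_1 \le \big\|\gamma_s[X]\big\|_1 \qquad \forall\, t \ge s \ge 0 ,\quad \forall\, X = X^\dagger \in M_d(\mathbb{C}) .$$

If the maps $\gamma_t$ are invertible, then let us choose $X = \gamma_s^{-1}[Y]$ where $Y = Y^\dagger$. Notice
that the Hermiticity of $Y$ guarantees the Hermiticity of $X$; indeed, since $\gamma_s$ preserves
Hermiticity, also $\gamma_s^{-1}$ does:

$$\gamma_s\big[(\gamma_s^{-1}[Y])^\dagger\big] = \big(\gamma_s\circ\gamma_s^{-1}[Y]\big)^\dagger = Y^\dagger = Y \;\Longrightarrow\; \big(\gamma_s^{-1}[Y]\big)^\dagger = \gamma_s^{-1}[Y] .$$

Therefore, $\|\gamma_{t,s}[Y]\|_1 \le \|Y\|_1$ for all Hermitean $Y$; hence, the intertwiners are
contractive and thus positive according to Proposition 5.2.1.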


Example 5.6.9 Let us consider the case of the Pauli maps addressed in Example 5.6.8.
If $X = X^\dagger$, using Example 5.4.1 and (5.226) one gets that the 4-vector $(x_0, \boldsymbol{x})$
associated to $X$ gets mapped by $\gamma_t$ into

$$\big(x_0\,,\; e^{-t}\cosh t\; x_1\,,\; e^{-t}\cosh t\; x_2\,,\; e^{-2t}\,x_3\big) ,$$

so that the trace-norm $\|\gamma_t[X]\|_1$ is either constant in time, $2\,|x_0|$, or

$$\big\|\gamma_t[X]\big\|_1 = 2\,\sqrt{\frac{1+2e^{-2t}+e^{-4t}}{4}\,\big(x_1^2+x_2^2\big) + e^{-4t}\,x_3^2} ,$$

whose time-derivative is never positive. Therefore, the maps $\gamma_t$ being invertible (see
(5.229)), Proposition 5.6.2 shows that the intertwiners $\gamma_{t,s}$ defined by (5.230) are
positive and thus the CP maps $\gamma_t$ defined by (5.226), though not CP-divisible (see
Example 5.6.8), are P-divisible.


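The non-Markovian dynamics discussed in the previous examples consists of
maps that are particular instances of unital, trace-preserving CP maps $\Lambda$ having the
Pauli matrices as "eigenvectors", namely, such that $\Lambda[\sigma_j] = \lambda_j\,\sigma_j$. Suppose indeed
that the time-dependent generator is such that

$$L_t[\mathbb{1}] = 0 , \qquad L_t[\sigma_j] = -\lambda_j(t)\,\sigma_j , \quad j = 1,2,3 . \qquad (5.232)$$

It is simple algebra to show that the GKSL form of $L_t$ is

$$L_t[\rho] = \frac{1}{4}\sum_{j=1}^{3}\mu_j(t)\,\big(\sigma_j\,\rho\,\sigma_j - \rho\big) , \quad\hbox{where} \qquad (5.233)$$

$$\mu_1(t) = \lambda_2(t) + \lambda_3(t) - \lambda_1(t) , \qquad (5.234)$$

$$\mu_2(t) = \lambda_1(t) + \lambda_3(t) - \lambda_2(t) , \qquad (5.235)$$

$$\mu_3(t) = \lambda_1(t) + \lambda_2(t) - \lambda_3(t) . \qquad (5.236)$$

Then, as in Example 5.6.8, integration of $\partial_t\gamma_t[\sigma_j] = L_t[\gamma_t[\sigma_j]]$ yields

$$\gamma_t[\mathbb{1}] = \mathbb{1} , \qquad \gamma_t[\sigma_j] = \alpha_j(t)\,\sigma_j , \quad j = 1,2,3 , \qquad (5.237)$$

$$\alpha_j(t) = \exp\Big(-\int_0^t ds\,\lambda_j(s)\Big) . \qquad (5.238)$$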


Proposition 5.6.3 The maps $\gamma_t$ defined by (5.233)-(5.236) are


1. CP if and only if

$$|\alpha_3(t)| \le 1 , \qquad 1 \pm \alpha_3(t) \ge |\alpha_1(t) \pm \alpha_2(t)| ; \qquad (5.239)$$

2. positive if and only if

$$\int_0^t ds\,\lambda_j(s) \ge 0 \qquad \forall\, j = 1,2,3 ; \qquad (5.240)$$

3. CP-divisible if and only if for every cyclic permutation $(i,j,k)$ of $(1,2,3)$ it holds
that

$$\lambda_i(t) + \lambda_j(t) \ge \lambda_k(t) ; \qquad (5.241)$$

4. P-divisible if and only if

$$\lambda_j(t) \ge 0 \qquad \forall\, j = 1,2,3 . \qquad (5.242)$$


Proof The Choi matrix associated with $\Gamma_t = \mathrm{id}_2\otimes\gamma_t$ reads

$$\Gamma_t[P^{(2)}_+] = \frac{1}{4}\begin{pmatrix}
1+\alpha_3(t) & 0 & 0 & \alpha_1(t)+\alpha_2(t) \\
0 & 1-\alpha_3(t) & \alpha_1(t)-\alpha_2(t) & 0 \\
0 & \alpha_1(t)-\alpha_2(t) & 1-\alpha_3(t) & 0 \\
\alpha_1(t)+\alpha_2(t) & 0 & 0 & 1+\alpha_3(t)
\end{pmatrix} .$$

Then, the conditions (5.239) for complete positivity follow from the request that
$\Gamma_t[P^{(2)}_+] \ge 0$, while the conditions (5.240) for positivity correspond to asking that any
initial Bloch vector $(r_1, r_2, r_3)$ (see Example 5.5.1), evolving into
$(\alpha_1(t)\,r_1, \alpha_2(t)\,r_2, \alpha_3(t)\,r_3)$ at $t \ge 0$, remain within the Bloch sphere.

On the other hand, the conditions (5.241) for CP-divisibility follow from
Theorem 5.6.2 which identifies the latter with the diagonal Kossakowski matrix
$C_t = \frac{1}{4}\,\mathrm{diag}(\mu_1(t), \mu_2(t), \mu_3(t))$ being positive semi-definite.

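Finally, the conditions (5.242) follow from P-divisibility since, using Proposition
5.6.2, the latter property implies that the time-derivative of the trace-norm
$\|\gamma_t[X]\|_1$ of any $X = X^\dagger \in M_2(\mathbb{C})$ is non-positive. In particular,

$$\frac{d}{dt}\,\big\|\gamma_t[\sigma_j]\big\|_1 = 2\,\frac{d}{dt}\,\alpha_j(t) = -2\,\lambda_j(t)\,\alpha_j(t) \qquad \forall\, j = 1,2,3 ,$$

which, since $\alpha_j(t) > 0$, is non-positive only if $\lambda_j(t) \ge 0$.

Vice versa, if (5.242) holds, then only one of the $\mu_j(t)$ in (5.234)-(5.236) can be
negative, since the sum of any two of them equals $2\lambda_k(t) \ge 0$. Let it be
$\mu_1(t) = -|\mu_1(t)| < 0$ and consider the mean value with respect to a state vector $|\psi\rangle$
of the action of the generator on a projection onto a state vector $|\phi\rangle$ orthogonal
to $|\psi\rangle$:

$$\langle\psi|\,L_t\big[|\phi\rangle\langle\phi|\big]\,|\psi\rangle = \frac{1}{4}\,\langle\psi|\,\Lambda_t\big[|\phi\rangle\langle\phi|\big]\,|\psi\rangle , \qquad \Lambda_t[X] := \sum_{j=1}^{3}\mu_j(t)\,\sigma_j\,X\,\sigma_j .$$

Then,

$$\langle\psi|\,L_t\big[|\phi\rangle\langle\phi|\big]\,|\psi\rangle \ge 0 \qquad \forall\, |\psi\rangle, |\phi\rangle \in \mathbb{C}^2 \ \hbox{with}\ \langle\psi|\phi\rangle = 0 . \qquad (5.243)$$

Indeed,

$$|\mu_1(t)| - \mu_2(t) = -\lambda_2(t) - \lambda_3(t) + \lambda_1(t) - \lambda_1(t) - \lambda_3(t) + \lambda_2(t) = -2\,\lambda_3(t) \le 0 ,$$

and analogously $|\mu_1(t)| \le \mu_3(t)$. One can then use Proposition 6.2.2 with $d = 4$,
$L_0 = 0$, $L_{1,2,3} = \sigma_{1,2,3}/\sqrt{2}$ and $\ell_{1,2,3} = \mu_{1,2,3}(t)$. As a consequence of (5.243)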

and of Remark 5.6.6, the intertwiners $\gamma_{t,s}$, $t \ge s \ge 0$, generated by $L_t$ are positive
and thus the maps $\gamma_t$ are P-divisible.

Example 5.6.10 The maps considered in Examples 5.6.8 and 5.6.9 are a straightforward
application of the more general structure discussed in the previous Proposition,
with $t \ge 0$ and

$$\lambda_1(t) = \lambda_2(t) = 1 - \tanh t \ge 0 , \qquad \lambda_3(t) = 2 ,$$

$$\mu_1(t) = \mu_2(t) = 2 , \qquad \mu_3(t) = -2\tanh t ,$$

$$\alpha_1(t) = \alpha_2(t) = e^{-t}\cosh t , \qquad \alpha_3(t) = e^{-2t} .$$
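As an illustration, the four criteria of Proposition 5.6.3 can be evaluated for the rates of Example 5.6.10. This is a minimal sketch under stated assumptions: all function names are hypothetical, the closed forms of $\alpha_j(t)$ are those quoted in the example, and condition (5.240) is tested in the equivalent form $|\alpha_j(t)| \le 1$.

```python
import math

TOL = 1e-12  # tolerance for rounding errors

def rates(t):
    """lambda_j(t) of Example 5.6.10."""
    return (1.0 - math.tanh(t), 1.0 - math.tanh(t), 2.0)

def alphas(t):
    """alpha_j(t) = exp(-int_0^t ds lambda_j(s)), in closed form."""
    a = math.exp(-t) * math.cosh(t)
    return (a, a, math.exp(-2.0 * t))

def is_cp(t):
    """Condition (5.239)."""
    a1, a2, a3 = alphas(t)
    return (abs(a3) <= 1.0 + TOL
            and 1.0 + a3 >= abs(a1 + a2) - TOL
            and 1.0 - a3 >= abs(a1 - a2) - TOL)

def is_positive(t):
    """Condition (5.240), written as |alpha_j(t)| <= 1."""
    return all(abs(a) <= 1.0 + TOL for a in alphas(t))

def cp_divisible_at(t):
    """Condition (5.241): all mu_j(t) of (5.234)-(5.236) are non-negative."""
    l1, l2, l3 = rates(t)
    return all(m >= -TOL for m in (l2 + l3 - l1, l1 + l3 - l2, l1 + l2 - l3))

def p_divisible_at(t):
    """Condition (5.242): all lambda_j(t) are non-negative."""
    return all(l >= -TOL for l in rates(t))

summary = {t: (is_cp(t), is_positive(t), cp_divisible_at(t), p_divisible_at(t))
           for t in (0.0, 0.5, 1.0, 2.0)}
```

For every sampled $t > 0$ the sketch reports CP, positive and P-divisible, but not CP-divisible, in agreement with Examples 5.6.8 and 5.6.9.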


5.6.7 Back-Flow of Information 


In Sect. 5.6.5, we have seen that the somewhat abstract justification of complete
positivity based on the statistical coupling to inert ancillas can be given more physical
ground by letting the ancilla evolve in time according to the same dynamics as the
open system of interest. In a non-Markovian scenario, the following Proposition
corresponds to Proposition 5.6.1 as much as Theorem 5.6.2 corresponds, in the case
of time-dependent generators, to Theorem 5.6.1 in the case of one-parameter
semi-groups. While Proposition 5.6.1 establishes that $\gamma_t\otimes\gamma_t$ is positive if and only
if $\gamma_t$ is CP, the next one states instead that $\gamma_t\otimes\gamma_t$ is P-divisible if and only if $\gamma_t$ is
CP-divisible. As we shall see in this section, this result has interesting consequences
in relation to the phenomenon of back-flow of information which may emerge when
memory effects are present.


Proposition 5.6.4 ([33]) Let $\{\gamma_t\}_{t\ge 0}$ be a one-parameter family of dynamical maps
$\gamma_t$ on the space of states of a $d$-level quantum system with generator as in (5.217).
Then, the linear maps $\Gamma_t = \gamma_t\otimes\gamma_t$ are P-divisible if and only if the maps $\gamma_t$ are
CP-divisible.


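Proof One implication is straightforward: if $\gamma_t$ is CP-divisible for all $t \ge 0$, then,
because of Definition 5.6.2, the intertwiners $\gamma_{t,s}$ are CP for all $t \ge s \ge 0$. Then, the
maps $\Gamma_{t,s} := \gamma_{t,s}\otimes\gamma_{t,s}$ are positive and thus the maps $\Gamma_t$ are P-divisible. For the
other implication, one considers the short-time expansion

$$\gamma_{t+\varepsilon,t}\otimes\gamma_{t+\varepsilon,t} \simeq \mathrm{id}_d\otimes\mathrm{id}_d + \varepsilon\,\big(L_t\otimes\mathrm{id}_d + \mathrm{id}_d\otimes L_t\big) .$$

Then, as in the proof of the second part of Theorem 5.6.1, the request that

$$L(\psi,\phi) := \langle\psi|\,\big(L_t\otimes\mathrm{id}_d + \mathrm{id}_d\otimes L_t\big)\big[|\phi\rangle\langle\phi|\big]\,|\psi\rangle \ge 0$$

for all orthogonal $\psi, \phi \in \mathbb{C}^d\otimes\mathbb{C}^d$ implies the positive semi-definiteness of the
time-dependent Kossakowski matrix $C_t = [C_{ab}(t)]$ in (5.217).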


The notion of Back-Flow of Information (BFI in short) relates to the possibility
that, when a physical system is coupled to an environment, the presence of memory
effects in its reduced dynamics may reflect the fact that information which goes from the
open system to its environment might be stored there and then retrieved by the system.
This phenomenon should be witnessed by states becoming more distinguishable
because of the BFI, while, without memory effects, information is simply lost to
the environment with the result that initially different states become less and less
distinguishable in the course of time. These plausible and intuitive arguments are
formalized by looking at the so-called generalized trace-distance [86,378].


Definition 5.6.3 Given a statistical ensemble consisting of two density matrices
$\rho_{1,2} \in B(\mathbb{H})$ with weights $0 \le \mu \le 1$ and $1-\mu$, their generalized trace-distance is
given by the trace-norm of the Helstrom matrix:

$$D_\mu(\rho_1,\rho_2) = \big\|\Delta_\mu(\rho_1,\rho_2)\big\|_1 , \qquad \Delta_\mu(\rho_1,\rho_2) := \mu\,\rho_1 - (1-\mu)\,\rho_2 . \qquad (5.244)$$


The generalized trace-distance extends the notion of trace-distance introduced in
Definition 6.3.4 to which $D_\mu(\rho_1,\rho_2)$ reduces for $\mu = 1/2$. Moreover, as discussed
in Example 5.6.5.2, the trace-norm of the Helstrom matrix is indeed a measure of the
distinguishability of two density matrices. In [86] it is used to introduce the following
notion of Markovianity.


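Definition 5.6.4 A one-parameter family of unital CP maps $\{\gamma_t^+\}_{t\ge 0}$ on $M_d(\mathbb{C})$ is
called Markovian when, for all density matrices $\rho_{1,2}$ and weights $0 \le \mu \le 1$, the
time-derivative of the generalized trace-distance satisfies

$$\frac{d}{dt}\,D_\mu\big(\gamma_t[\rho_1]\,,\,\gamma_t[\rho_2]\big) \le 0 \qquad \forall\, t \ge 0 . \qquad (5.245)$$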
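For a single qubit the generalized trace-distance has a simple closed form, which makes condition (5.245) easy to probe numerically. The following is a minimal sketch, assuming density matrices written as $\rho_k = (\mathbb{1} + \boldsymbol{r}_k\cdot\boldsymbol{\sigma})/2$ with Bloch vectors $\boldsymbol{r}_k$; the function names, the chosen Bloch vectors and the weight are illustrative assumptions, and the dynamics is the one of Example 5.6.8.

```python
import math

def generalized_trace_distance(r1, r2, mu):
    """Trace-norm of mu*rho1 - (1-mu)*rho2 for qubits with Bloch vectors r1, r2.

    The Helstrom matrix equals ((2mu-1) 1 + v.sigma)/2 with
    v = mu*r1 - (1-mu)*r2, so its eigenvalues are ((2mu-1) +/- |v|)/2.
    """
    v = [mu * a - (1.0 - mu) * b for a, b in zip(r1, r2)]
    norm_v = math.sqrt(sum(c * c for c in v))
    c0 = 2.0 * mu - 1.0
    return 0.5 * (abs(c0 + norm_v) + abs(c0 - norm_v))

def evolve(r, t):
    """Action of gamma_t in (5.226) on a Bloch vector."""
    a = math.exp(-t) * math.cosh(t)
    return (a * r[0], a * r[1], math.exp(-2.0 * t) * r[2])

# Illustrative choice of states and weight (assumptions, not from the text).
r1, r2, mu = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.3
times = [0.1 * k for k in range(31)]
distances = [generalized_trace_distance(evolve(r1, t), evolve(r2, t), mu)
             for t in times]
non_increasing = all(b <= a + 1e-12 for a, b in zip(distances, distances[1:]))
```

For this choice the distance never increases along the sampled times, as expected from the P-divisibility established in Example 5.6.9; a sampled check of this kind is of course only an indication, not a proof.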


This definition is connected to the flow of information between the system $S$, whose
reduced dynamics is $\gamma_t$, and the environment $E$ at its origin. If $U_t$ describes the unitary
dynamics of $S+E$ and $\rho\otimes\rho_E$ is the initial state, the invariance of the spectrum
yields

$$D_\mu\big(U_t\,\rho_1\otimes\rho_E\,U_t^\dagger\,,\,U_t\,\rho_2\otimes\rho_E\,U_t^\dagger\big) = D_\mu\big(\rho_1\otimes\rho_E\,,\,\rho_2\otimes\rho_E\big)
= \big\|\Delta_\mu(\rho_1,\rho_2)\otimes\rho_E\big\|_1 = \big\|\Delta_\mu(\rho_1,\rho_2)\big\|_1\,\big\|\rho_E\big\|_1 = D_\mu(\rho_1,\rho_2) ,$$

where in the last equality it was supposed, for the sake of simplicity, that the environment
state is trace-class. The left hand side in the previous equality is thus an expression
of the total information $\mathcal{I}^{tot}_t(\rho_1,\rho_2)$ about the states $\rho_{1,2}$ that cannot be accessed
through the system $S$ only. Notice that such information is time-invariant:
$\mathcal{I}^{tot}_t(\rho_1,\rho_2) = \mathcal{I}^{tot}_{t=0}(\rho_1,\rho_2)$. Instead,

$$\mathcal{I}^{int}_t(\rho_1,\rho_2) := D_\mu\big(\gamma_t[\rho_1]\,,\,\gamma_t[\rho_2]\big) \qquad (5.246)$$

is the internal information accessible only through the dynamics of the open system
$S$. Therefore, the difference

$$\mathcal{I}^{ext}_t(\rho_1,\rho_2) := D_\mu\big(U_t\,\rho_1\otimes\rho_E\,U_t^\dagger\,,\,U_t\,\rho_2\otimes\rho_E\,U_t^\dagger\big) - D_\mu\big(\gamma_t[\rho_1]\,,\,\gamma_t[\rho_2]\big) \qquad (5.247)$$

represents the external information which is accessible through the environment
only. From the fact that

$$\mathcal{I}^{int}_t(\rho_1,\rho_2) + \mathcal{I}^{ext}_t(\rho_1,\rho_2) = \mathcal{I}^{tot}_{t=0}(\rho_1,\rho_2) \;\Longrightarrow\; \frac{d\,\mathcal{I}^{int}_t(\rho_1,\rho_2)}{dt} = -\frac{d\,\mathcal{I}^{ext}_t(\rho_1,\rho_2)}{dt} ,$$

it follows that to an increase in the internal information, which would violate the
condition in Definition 5.6.4, there corresponds a decrease in the external information
stored in the environment.

The connection between the two notions of Markovian dynamics, the one based
on CP-divisibility given in Definition 5.6.2 and the one based on the non-increasing
of the generalized trace-distance in Definition 5.6.4, is provided by Proposition 5.6.2.


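Proposition 5.6.5 Given a one-parameter family of invertible, unital CP maps $\{\gamma_t^+\}_{t\ge 0}$
on $M_d(\mathbb{C})$, the trace-preserving dual maps $\gamma_t$ satisfy

$$\frac{d}{dt}\,D_\mu\big(\gamma_t[\rho_1]\,,\,\gamma_t[\rho_2]\big) \le 0 \qquad \forall\, t \ge 0 ,$$

for all density matrices $\rho_{1,2} \in M_d(\mathbb{C})$ and $0 \le \mu \le 1$, if and only if the dynamics $\gamma_t$
is P-divisible.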


Proof According to the first part of Proposition 5.6.2, if the maps $\gamma_t$ are P-divisible,
then (5.231) holds with $X = \Delta_\mu(\rho_1,\rho_2)$. Vice versa, given any $X = X^\dagger \in M_d(\mathbb{C})$,
by decomposing it into its positive and negative parts, $X = X_+ - X_-$ with $X_\pm \ge 0$, and
normalizing them via the trace-norm $\|X\|_1 = \mathrm{Tr}(X_+) + \mathrm{Tr}(X_-)$, any such $X$ is
proportional to a Helstrom matrix:

$$X = \|X\|_1\,\Delta_\mu(\rho_1,\rho_2) , \qquad \mu = \frac{\mathrm{Tr}(X_+)}{\|X\|_1} , \quad \rho_1 = \frac{X_+}{\mathrm{Tr}(X_+)} , \quad \rho_2 = \frac{X_-}{\mathrm{Tr}(X_-)} .$$

Thus, if (5.245) holds as assumed, then (5.231) follows and thus, from the second
part of Proposition 5.6.2, the P-divisibility of the maps $\gamma_t$ ensues.


Remark 5.6.7 Propositions 5.6.4 and 5.6.5 imply that invertible $\gamma_t$ that are
P-divisible, but not CP-divisible, do not show BFI but give rise to $\gamma_t\otimes\gamma_t$ that cannot
be P-divisible and thus show BFI. Thus a peculiar superactivation of BFI emerges
in that a flow of information from environment to system can be generated by
statistically coupling to a copy of itself a single non-Markovian open quantum system for
which no information can flow to it from the environment.


Quantum Information Theory


In recent years, a considerable amount of theoretical and experimental studies
has focused on the impact that quantum mechanics may have on computer
science, information theory and cryptography. We shall loosely refer to this vast and
variegated field as quantum information [64,90,122,156,183,266,282,287,360]. In
the following, we shall briefly touch upon a small fraction of its many achievements.


6.1 Quantum Information Theory 


Why quantum information? Is classical information not a sufficiently powerful
theory to satisfy our needs? The answer is that it will indeed be so as long as
computational models and information transmission protocols are based on classical physics.
Indeed, information is physical [71,222] for it is carried by physical entities, and
transmitted and manipulated by physical means; as a consequence, any actual information
processing protocol will rely upon a model describing the physical processes
involved. Since Nature is considered to be ultimately quantal, one is inevitably led
to consider a scenario in which quantum mechanics will set the rules of the game
also in dealing with information and its manifold aspects.

Roughly speaking, the issue at stake is the use of qubits instead of bits as funda- 
mental informational resources so that one has the whole Bloch sphere of two-level 
system states at disposal instead of just the two states (up and down along the z- 
direction) that are available to classical spins. 

When the information that is manipulated regards computational processes, the
question is whether Quantum Turing Machines (QTMs), that is computing devices
based on the laws of quantum mechanics, might perform better than classical Turing
machines. A breakthrough was indeed the discovery that relevant speedups can
be gained by quantum algorithms because of the huge parallel computation made
available by the possibility of linearly superposing qubit states.

Truly, from an abstract perspective, as much as classical mechanics is contained in
quantum mechanics, also classical information, computation and cryptography may
be thought of as commutative versions of more general theories, still in their infancy,
that are to be soundly formulated within a quantum, non-commutative, framework.
However, the need to elaborate these more general theories is not only justified in
principle, but comes from concrete facts. The pace at which every two years
electronic devices double their efficiency (the so-called Moore's law) and decrease
in size is such that non-classical effects will soon appear and quantum mechanics
will become necessary to cope with them.

If information is carried by qubits, then the possible reversible operations to
which they can be subjected are all those corresponding to unitary matrices in $M_2(\mathbb{C})$:
these are called quantum gates. In the classical case, the only non-trivial gate on bits
0, 1 is that which flips them, $0 \mapsto 1$, $1 \mapsto 0$. Consider, for instance, the Hadamard
transformation in (5.60); its $n$-fold tensor product $U_H^{\otimes n}$ acting on $|0\rangle^{\otimes n}$ produces,
in one stroke, a uniform linear combination of kets labeled by the $2^n$ binary strings
$i^{(n)} \in \Omega_2^{(n)}$:

$$U_H^{\otimes n}\,|0\rangle^{\otimes n} = \frac{1}{2^{n/2}}\sum_{i^{(n)}\in\Omega_2^{(n)}} |i^{(n)}\rangle . \qquad (6.1)$$


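Linearity is at the basis of quantum parallelism: suppose that the computation of
a binary function $f : \Omega_2^{(n)} \mapsto \Omega_2^{(n)}$ on $n$ bits with $n$-bit strings as images,
$i^{(n)} \mapsto f(i^{(n)})$, can be operated by means of a unitary transformation

$$|i^{(n)}\rangle \mapsto U_f\,|i^{(n)}\rangle = |f(i^{(n)})\rangle$$

on $n$ qubits. By $U_f$, $f$ is computed on all strings at once, as follows:

$$U_f\,U_H^{\otimes n}\,|0\rangle^{\otimes n} = \frac{1}{2^{n/2}}\sum_{i^{(n)}\in\Omega_2^{(n)}} |f(i^{(n)})\rangle .$$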


The linear structure of quantum mechanics seems to provide a more powerful setting 
than the classical scenario; however, the extraction of information out of quantum 
states is a much more delicate problem than with binary strings. 

Any computation performed by a QTM on $n$ qubits must correspond to a unitary
operator on $(\mathbb{C}^2)^{\otimes n}$; then, a quantum algorithm acting on an initial state of the $n$
qubits would halt in a linear combination of all possible computational basis vectors
in $(\mathbb{C}^2)^{\otimes n}$, each one of them corresponding to a classical $n$-bit string $i^{(n)}$ occurring
with a certain amplitude $C(i^{(n)})$. If the solution of a problem is a specific binary
string $i^{(n)}$, an efficient quantum computation must associate to that string a very
high probability, $|C(i^{(n)})|^2 \simeq 1$. Only in this case would the solution show up with
almost certainty from a measurement in the computational basis.

Example 6.1.1 (Deutsch-Jozsa Algorithm) Let $f : \Omega_2^{(n)} \mapsto \{0,1\}$ be a binary function
that is known to be either constant or balanced, that is $f(i^{(n)}) = 0$ on half of
the $n$-digit strings and $f(i^{(n)}) = 1$ on the other half. The task is to decide between
the two possibilities. Classically, the only way to ascertain whether $f$ is constant
or not is to compute it on $2^{n-1}+1$ strings, that is on half plus one of them; this is
because one can compute always 0 or always 1 on $2^n/2$ strings in a row without the
function being constant, so that only one more computation can settle the question.
On the other hand, if the bit strings $i^{(n)}$ could indeed be treated as computational
basis vectors $|i^{(n)}\rangle$ in the Hilbert space $\mathbb{H} = (\mathbb{C}^2)^{\otimes n}$ of $n$ qubits, then the following
quantum algorithm would answer the question in just one trial. It is based on
a generalization of the CNOT gate of Example 5.5.9; instead of one control qubit,
there are $n$ of them all prepared in the same state $|0\rangle$ together with one target qubit
in the state $|1\rangle$. As seen in (6.1),


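$$|\Psi\rangle = \big(U_H^{\otimes n}\otimes U_H\big)\,|0\rangle^{\otimes n}\otimes|1\rangle = \frac{1}{2^{n/2}}\sum_{i^{(n)}\in\Omega_2^{(n)}} |i^{(n)}\rangle\otimes\frac{|0\rangle - |1\rangle}{\sqrt{2}} .$$

The matrix $M_{2^n}(\mathbb{C})\otimes M_2(\mathbb{C}) \ni U_f = \sum_{j^{(n)}\in\Omega_2^{(n)}} |j^{(n)}\rangle\langle j^{(n)}|\otimes\sigma_1^{f(j^{(n)})}$ is unitary
and flips the last qubit only if $f(j^{(n)}) = 1$. This yields¹

$$|\Psi_f\rangle = U_f\,|\Psi\rangle = \frac{1}{2^{n/2}}\sum_{i^{(n)}\in\Omega_2^{(n)}} |i^{(n)}\rangle\otimes\frac{|0\oplus f(i^{(n)})\rangle - |1\oplus f(i^{(n)})\rangle}{\sqrt{2}}
= \frac{1}{2^{n/2}}\sum_{i^{(n)}\in\Omega_2^{(n)}} (-1)^{f(i^{(n)})}\,|i^{(n)}\rangle\otimes\frac{|0\rangle - |1\rangle}{\sqrt{2}} .$$

Applying the Hadamard rotation on the first $n$ qubits (see (5.60)), one gets

$$|\widetilde{\Psi}_f\rangle = \big(U_H^{\otimes n}\otimes\mathbb{1}\big)\,|\Psi_f\rangle = \frac{1}{2^{n}}\sum_{i^{(n)},\,j^{(n)}\in\Omega_2^{(n)}} (-1)^{f(i^{(n)}) + i^{(n)}\cdot j^{(n)}}\,|j^{(n)}\rangle\otimes\frac{|0\rangle - |1\rangle}{\sqrt{2}} ,$$

where $i^{(n)}\cdot j^{(n)} := \sum_{k=1}^{n} i_k\,j_k$. Since projecting onto $|0\rangle^{\otimes n}$ yields

$$\Big(\frac{1}{2^{n}}\sum_{i^{(n)}\in\Omega_2^{(n)}} (-1)^{f(i^{(n)})}\Big)\,|0\rangle^{\otimes n}\otimes\frac{|0\rangle - |1\rangle}{\sqrt{2}} ,$$

the amplitude of $|0\rangle^{\otimes n}$ in $|\widetilde{\Psi}_f\rangle$ is $0$ if $f$ is balanced, $\pm 1$ if $f$ is constant. Therefore, after
operating the circuit one has just to perform a measurement in the computational basis
$\{|i^{(n)}\rangle\}_{i^{(n)}\in\Omega_2^{(n)}}$ of the first $n$ qubits: if $|0\rangle^{\otimes n}$ occurs the binary function is constant,
otherwise it is balanced.

¹ The CNOT gate (5.175) corresponds to choosing $f(i) = i$, $i = 0, 1$.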
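The final amplitudes of the algorithm can be reproduced by a small classical emulation. The sketch below is only an illustration of the formula for $|\widetilde{\Psi}_f\rangle$: it evaluates $f$ on all $2^n$ strings and therefore shows none of the quantum speedup. The function names and the three test functions are assumptions introduced here.

```python
from itertools import product

def final_amplitudes(f, n):
    """Amplitudes 2^{-n} * sum_i (-1)^{f(i) + i.j} of |j> on the first n qubits."""
    strings = list(product((0, 1), repeat=n))
    amplitudes = {}
    for j in strings:
        total = 0
        for i in strings:
            dot = sum(a * b for a, b in zip(i, j))   # i . j = sum_k i_k j_k
            total += (-1) ** (f(i) + dot)
        amplitudes[j] = total / 2 ** n
    return amplitudes

def deutsch_jozsa(f, n):
    """Decide 'constant' or 'balanced' from the amplitude of |0...0>."""
    zero = (0,) * n
    return "constant" if abs(final_amplitudes(f, n)[zero]) > 0.5 else "balanced"

def constant_one(bits):      # constant function, f = 1 on every string
    return 1

def parity(bits):            # balanced function: parity of the string
    return sum(bits) % 2

def first_bit(bits):         # balanced function: value of the first bit
    return bits[0]
```

With these definitions, `deutsch_jozsa(constant_one, 3)` gives `"constant"`, while `parity` and `first_bit` are recognized as balanced.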


The preceding discussion regards instances of classical information being encoded 
into quantum states and manipulated by quantum gates. Similarly, classical informa- 
tion might be stored or transmitted by quantum means and the question is then how to 
retrieve it with high reliability or fidelity. Sending classical information encoded into 
non-orthogonal quantum states has indeed the advantage of being protected against 
undetected eavesdropping. 


Example 6.1.2 (No-Cloning) Suppose a sender A encodes the bits 0 and 1 into
the qubits $|\psi_0\rangle, |\psi_1\rangle \in \mathbb{C}^2$ with $|\psi_0\rangle \neq |\psi_1\rangle$ and sends them to a receiver B. If a
spy E wants to access this amount of information without being spotted, he/she
has to read the transmitted state without changing it, otherwise sender and receiver
might get alerted. A way to do this is for the spy to intercept the message during
transmission and to copy it by means of a unitary operator $U_E$ acting as follows:
$U_E\big(|\psi_i\rangle\otimes|e\rangle\big) = |\psi_i\rangle\otimes|\psi_i\rangle$. But unitarity implies

$$\langle\psi_0|\psi_1\rangle = \langle\psi_0\otimes e\,|\,\psi_1\otimes e\rangle = \langle\psi_0\otimes e\,|\,U_E^\dagger U_E\,|\,\psi_1\otimes e\rangle = \big(\langle\psi_0|\psi_1\rangle\big)^2 ,$$

whence $\psi_{0,1}$, not being equal, must be orthogonal. Therefore, if the code states $\psi_{0,1}$
are chosen not to be orthogonal, the spy cannot copy them without alterations. This
argument goes under the name of no-cloning theorem and asserts that there cannot
exist a unitary operator $U$ that implements the operation of copying two generic
quantum states, unless they are orthogonal. Indeed, if such a unitary operator $U$
existed, then, on the linear combinations of two orthogonal states $|\psi\rangle, |\phi\rangle$,

$$U\big((\alpha\,|\psi\rangle + \beta\,|\phi\rangle)\otimes|e\rangle\big) = \alpha\,|\psi\rangle\otimes|\psi\rangle + \beta\,|\phi\rangle\otimes|\phi\rangle$$
$$= (\alpha\,|\psi\rangle + \beta\,|\phi\rangle)\otimes(\alpha\,|\psi\rangle + \beta\,|\phi\rangle)$$
$$= \alpha^2\,|\psi\rangle\otimes|\psi\rangle + \beta^2\,|\phi\rangle\otimes|\phi\rangle + \alpha\beta\,\big(|\psi\rangle\otimes|\phi\rangle + |\phi\rangle\otimes|\psi\rangle\big) .$$

This can only be true if either $\alpha = 0$ or $\beta = 0$, as one can see by scalar multiplication
by $|\psi\rangle\otimes|\phi\rangle$.


In Sect. 5.5.7, the central role of entanglement as a resource for quantum informational
tasks has already been emphasized. Among the applications of entangled
states to information transmission are the protocols for the so-called dense coding and
teleportation. In the first case, 2 bits can be sent with one use of an entangled quantum
channel, which points to the possibility of achieving higher channel capacities if
channels behave quantum mechanically. In the second case, quantum states can be
transferred between distant parties sharing an entangled state by means of local
quantum operations and classical communication (known as LOCC operations).

Fig. 6.1 Teleportation


Example 6.1.3 (Dense Coding) If sender A and receiver B share the entangled state
(5.174), A can encode the pairs of bits $(xy)$ into the Bell states of Example 5.5.9 by
local operations performed on his qubit only. Indeed, the states $|\Psi_{xy}\rangle$ result from
acting with the Pauli matrices on the first qubit of $|\Psi_{00}\rangle$; explicitly,

$$|\Psi_{xy}\rangle = \big(\sigma_3^x\,\sigma_1^y\otimes\mathbb{1}\big)\,|\Psi_{00}\rangle , \qquad x, y = 0, 1 .$$

Then, if A and B share $|\Psi_{00}\rangle$, in order to send B two bits $(x,y)$ of classical
information, A acts on his qubit with $\sigma_3^x\,\sigma_1^y$ and sends it to B. When both qubits are with
him, B has them in the state $|\Psi_{xy}\rangle$; by performing a measurement in the Bell basis,
he can thus recover the pair $(xy)$. Roughly speaking, one can transmit two bits at the
price of 1 qubit, that is by just one use of the entangled quantum channel represented
by $|\Psi_{00}\rangle$ and its local modifications.


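Example 6.1.4 (Teleportation) Suppose A has two qubits, denoted by 1, 2, the first
one in the state $|\psi\rangle_1 = \alpha\,|0\rangle_1 + \beta\,|1\rangle_1$ and the second one being one party in the
symmetric Bell state $|\Psi_{00}\rangle_{23} = \frac{1}{\sqrt{2}}\sum_{i=0}^{1} |i\rangle_2\otimes|i\rangle_3$ together with a third qubit (3) of
B. Let B perform a Hadamard rotation on his qubit in $|\Psi_{00}\rangle_{23}$, changing the entangled
state into (see Fig. 6.1)

$$|\Phi\rangle_{23} := \big(\mathbb{1}\otimes U_H\big)\,|\Psi_{00}\rangle_{23} = \frac{1}{\sqrt{2}}\sum_{i=0}^{1} |i\rangle_2\otimes U_H\,|i\rangle_3 .$$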


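The state $|\psi\rangle_1$ can now be teleported from A to B becoming $|\psi\rangle_3$. The protocol is
as follows: A performs on his two qubits 1, 2 a measurement in the ONB $\{|\Psi_\mu\rangle_{12}\}_{\mu=0}^{3}$
of $\mathbb{C}^2\otimes\mathbb{C}^2$, where $|\Psi_\mu\rangle_{12} := \sigma_\mu\otimes U_H\,|\Psi_{00}\rangle_{12}$ with
$\langle\Psi_\mu|\Psi_\nu\rangle = \frac{1}{2}\mathrm{Tr}(\sigma_\mu\sigma_\nu) = \delta_{\mu\nu}$.
Notice that the amplitude of $|\Psi_\mu\rangle_{12}$ in the state $|\psi\rangle_1\otimes|\Phi\rangle_{23}$ is

$${}_{12}\langle\Psi_\mu|\,\big(|\psi\rangle_1\otimes|\Phi\rangle_{23}\big) = \frac{1}{2}\sum_{i,j=0}^{1} \langle i|\,\sigma_\mu\,|\psi\rangle\,\langle i|\,U_H\,|j\rangle\; U_H\,|j\rangle_3
= \frac{1}{2}\sum_{i=0}^{1} \langle i|\,\sigma_\mu\,|\psi\rangle\; U_H^2\,|i\rangle_3 = \frac{1}{2}\,\sigma_\mu\,|\psi\rangle_3 ,$$

where it has been used that $U_H = U_H^T$ and $U_H^2 = \mathbb{1}$. Thus, after A has classically
(that is by means of a classical channel) communicated to B the result of his local
measurement, B knows his qubit to be in the state $\sigma_\mu\,|\psi\rangle_3$, whence by a local rotation
by $\sigma_\mu$ he gets his third qubit in the state $|\psi\rangle_3$.

The procedure does not violate no-cloning, for the state that appears at B's end
disappears from A's end. Neither does it violate Einstein's locality; indeed, before
classical communication of the actually measured index $\mu$, B's state is the equidistributed
mixture of the four possibilities corresponding to the four different measurement
outcomes of A; explicitly, using Example 5.2.5 with the normalized Pauli
matrices $\sigma_\mu/\sqrt{2}$ as ONB,

$$\rho_3 = \frac{1}{4}\sum_{\mu=0}^{3} \sigma_\mu\,|\psi\rangle\langle\psi|\,\sigma_\mu = \frac{\mathbb{1}}{2} .$$

On the other hand, before A's measurement, the marginal state of B is

$$\rho_3 = \mathrm{Tr}_{1,2}\big(|\psi\rangle_1\langle\psi|\otimes|\Phi\rangle_{23}\langle\Phi|\big)
= \mathrm{Tr}_1\big(|\psi\rangle_1\langle\psi|\big)\;\frac{1}{2}\sum_{i,j=0}^{1} \mathrm{Tr}_2\big(|i\rangle_2\langle j|\big)\; U_H\,|i\rangle_3\langle j|\,U_H = \frac{\mathbb{1}}{2} .$$

Notice that the net effect of quantum teleportation is to get the third qubit in the
rotated state $\sigma_\mu\,|\psi\rangle$ by means of a measurement in the ONB $\{|\Psi_\mu\rangle_{12}\}_{\mu=0}^{3}$ performed
on qubits 1 and 2, when the state of 1, 2, 3 is $|\psi\rangle_1\otimes|\Phi\rangle_{23}$.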
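The amplitude computed above can be verified by direct contraction of the components in the computational basis. The following is a minimal sketch under stated assumptions: vectors and matrices are plain Python lists, $\sigma_0$ is the identity, the sample state $|\psi\rangle = 0.6\,|0\rangle + 0.8\,i\,|1\rangle$ and all function names are introduced here for illustration only.

```python
import math

S = 1.0 / math.sqrt(2.0)
H = [[S, S], [S, -S]]                       # Hadamard matrix U_H
PAULI = [
    [[1, 0], [0, 1]],                       # sigma_0 = identity
    [[0, 1], [1, 0]],                       # sigma_1
    [[0, -1j], [1j, 0]],                    # sigma_2
    [[1, 0], [0, -1]],                      # sigma_3
]

def apply(m, v):
    """Action of a 2x2 matrix on a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def bob_state_after_outcome(mu, psi):
    """Unnormalized state of qubit 3 when A finds the outcome mu."""
    sigma = PAULI[mu]
    # components of |Psi_mu>_{12} = (sigma_mu (x) U_H)|Psi_00> on |a>_1 |b>_2
    bell = [[S * sum(sigma[a][i] * H[b][i] for i in range(2)) for b in range(2)]
            for a in range(2)]
    # components of |Phi>_{23} = (1 (x) U_H)|Psi_00> on |b>_2 |c>_3
    phi = [[S * H[c][b] for c in range(2)] for b in range(2)]
    return [sum(complex(bell[a][b]).conjugate() * psi[a] * phi[b][c]
                for a in range(2) for b in range(2))
            for c in range(2)]

def close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

psi = [0.6, 0.8j]                           # illustrative normalized qubit state
teleportation_ok = all(
    close(bob_state_after_outcome(mu, psi), [0.5 * x for x in apply(PAULI[mu], psi)])
    and close(apply(PAULI[mu], [2.0 * x for x in bob_state_after_outcome(mu, psi)]), psi)
    for mu in range(4)
)
```

For each of the four outcomes the sketch finds B's qubit in the state $\frac{1}{2}\sigma_\mu|\psi\rangle$, and the local rotation by $\sigma_\mu$ recovers $|\psi\rangle$ after normalization.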

Example 6.1.5 In order to implement a two-qubit gate like the unitary $U_{CNOT}$ on
two target qubit states $\psi_{1,2}$, one adds to them three pairs of qubits, each of which in
the same entangled state $|\Phi\rangle$ introduced in the previous example. Thus, one deals
with a multipartite entangled state [90,294,384]

$$|\Psi\rangle := |\psi_1\rangle_1\otimes|\psi_2\rangle_2\otimes|\Phi\rangle_{34}\otimes|\Phi\rangle_{57}\otimes|\Phi\rangle_{68}$$

corresponding to the scheme in Fig. 6.2.

Fig. 6.2 One way quantum computation

Then, measurements are performed on qubits (1, 3, 5) and (2, 4, 6) in the ONBs
obtained from the GHZ vectors as in Example 5.5.10.3. By projecting $|\Psi\rangle$ onto the


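6 qubit state $|\Psi_{abc}\rangle_{135}\otimes|\Psi_{def}\rangle_{246}$, one computes

$$\big({}_{135}\langle\Psi_{abc}|\otimes{}_{246}\langle\Psi_{def}|\big)\,|\Psi\rangle = \Big(\frac{1}{\sqrt{2}}\Big)^{5}\sum_{r,s=0}^{1}\;\sum_{i,j,k=0}^{1} \langle r|\sigma_1^a|\psi_1\rangle\,\langle r|\sigma_1^b|i\rangle\,\langle r|\sigma_3^c|j\rangle$$
$$\times\;\langle s|\sigma_1^d|\psi_2\rangle\,\langle s|\sigma_1^e U_H|i\rangle\,\langle s|\sigma_3^f|k\rangle\; U_H|j\rangle_7\otimes U_H|k\rangle_8$$
$$= \Big(\frac{1}{\sqrt{2}}\Big)^{5}\sum_{r,s=0}^{1} \langle r|\sigma_1^a|\psi_1\rangle\,\langle s|\sigma_1^d|\psi_2\rangle\,\langle s|\sigma_1^e U_H\sigma_1^b|r\rangle\; U_H\sigma_3^c|r\rangle_7\otimes U_H\sigma_3^f|s\rangle_8$$
$$= \Big(\frac{1}{\sqrt{2}}\Big)^{5}\big(U_H\sigma_3^c\otimes U_H\sigma_3^f\big)\,U_{be}\,\big(\sigma_1^a\otimes\sigma_1^d\big)\,|\psi_1\rangle_7\otimes|\psi_2\rangle_8 ,$$

where in summing over $i, j, k$ it has been used that, under transposition, $\sigma_{1,3}^T = \sigma_{1,3}$.
Thus, apart from local unitary rotations, by measuring in the chosen ONB one
implements the unitary transformation

$$U_{be} = \sum_{r,s=0}^{1} \langle s|\,\sigma_1^e\,U_H\,\sigma_1^b\,|r\rangle\;|r\rangle\langle r|\otimes|s\rangle\langle s|
= \sum_{r,s=0}^{1} \langle s\oplus e|\,U_H\,|r\oplus b\rangle\;|r\rangle\langle r|\otimes|s\rangle\langle s|$$

on the state $|\psi_1\rangle_7\otimes|\psi_2\rangle_8$ of the pair of qubits that remain unaffected by the
measurement. In particular, choosing $a = b = c = d = e = f = 0$, it turns out that
$\sqrt{2}\,U_{00} = |0\rangle\langle 0|\otimes\mathbb{1} + |1\rangle\langle 1|\otimes\sigma_3 = (\mathbb{1}\otimes U_H)\,U_{CNOT}\,(\mathbb{1}\otimes U_H)$, namely $\sqrt{2}\,U_{00}$
amounts to the CNOT gate unitary matrix apart from unitary, local rotations.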


6.2 Bipartite Entanglement 


We have seen in Sect. 5.5.7 that, by looking at its marginal states, one knows whether a 
pure bipartite state is entangled or not. For density matrices entanglement detection 
is by far more difficult; only in low dimension the problem has been completely 
solved by the so-called Peres-Horodecki criterion [179, 183,279]. 


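Proposition 6.2.1 Let a bipartite system $S_1 + S_2$ be described by the algebra
$M_{d_1}(\mathbb{C})\otimes M_{d_2}(\mathbb{C})$; a state $\rho_{12} \in \mathcal{S}(S_1+S_2)$ is entangled if and only if there exists
a positive map $\Lambda : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$ such that $\mathrm{id}_{d_1}\otimes\Lambda^+[\rho_{12}]$ is not positive
definite, where $\Lambda^+ : \mathbb{B}_1(\mathbb{C}^{d_2}) \mapsto \mathbb{B}_1(\mathbb{C}^{d_1})$ is the dual map of $\Lambda$ from the space of states
$\mathcal{S}(S_2) = \mathbb{B}_1(\mathbb{C}^{d_2})$ to the space of states $\mathcal{S}(S_1) = \mathbb{B}_1(\mathbb{C}^{d_1})$.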


Proof The set $\mathcal{S}_{sep}(S_1 + S_2)$ of separable states over $M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ is the closure in trace-norm of the convex hull of pure separable states (see Remark 5.5.7). By the Hahn-Banach theorem [305], $\mathcal{S}_{sep}(S_1 + S_2)$ can be strictly separated from any entangled state $\rho_{ent}$ by a hyperplane, that is by a continuous linear functional $R : \mathcal{S}(S_1 + S_2) \mapsto \mathbb{R}$ and a real constant $a$ such that $R(\rho_{ent}) < a \le R(\rho_{sep})$. As the trace-norm and the Hilbert-Schmidt topology are equivalent in finite dimension, using the argument of Example 5.2.4, the action of $R$ can be represented by means of $R' = R'^\dagger \in M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ such that $R(\rho) = \mathrm{Tr}(R' \rho)$. Setting $S := R' - a\,1$, it thus follows that $\rho \in M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ is entangled if and only if there exists $S \in M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ such that $\mathrm{Tr}(S \rho) < 0$ while $\mathrm{Tr}(S \rho_{sep}) \ge 0$ for all $\rho_{sep} \in \mathcal{S}_{sep}(S_1 + S_2)$.

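Furthermore, to any such matrix, the Jamiołkowski isomorphism (see Remark 5.2.6) associates a positive map $\Lambda_S : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$ with $S$ as Choi matrix. Let $\Lambda_S^T : B_1(\mathbb{C}^{d_2}) \mapsto B_1(\mathbb{C}^{d_1})$ be its dual such that

$$\mathrm{Tr}(S \rho) = \mathrm{Tr}\big(\mathrm{id}_{d_1} \otimes \Lambda_S[P^{d_1}_+]\ \rho\big) = \mathrm{Tr}\big(P^{d_1}_+\ \mathrm{id}_{d_1} \otimes \Lambda_S^T[\rho]\big)$$

for all $\rho \in \mathcal{S}(S_1 + S_2)$. If $\rho$ is an entangled state such that $\mathrm{Tr}(S \rho) < 0$, then $\mathrm{id}_{d_1} \otimes \Lambda_S^T[\rho]$ cannot be positive definite. Vice versa, if $\mathrm{id}_{d_1} \otimes \Lambda^T[\rho] \ge 0$ for all positive $\Lambda : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$, then $\rho \in \mathcal{S}_{sep}(S_1 + S_2)$.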


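As a consequence of the previous argument, a map $\Lambda : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$ is a witness of the entanglement of the state $\rho \in \mathcal{S}(S_1 + S_2)$ if $\mathrm{id}_{d_1} \otimes \Lambda^T$ turns $\rho$ into a non-positive matrix. Therefore, $\Lambda$ cannot be a CP map; however, it preserves positivity. Indeed, the Choi matrix $L \in M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ associated to the dual map $\Lambda^T$ is block positive for $\mathrm{Tr}(L \rho) \ge 0$ whenever $\rho$ is separable, that is $\langle \psi \otimes \phi|L|\psi \otimes \phi\rangle \ge 0$ for all $\psi \in \mathbb{C}^{d_1}$ and $\phi \in \mathbb{C}^{d_2}$, whence $\Lambda^T$ is a positive map.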

Unfortunately, as already noticed (see Remark 5.2.7.3), unlike CP maps for which 
Proposition 5.2.4 holds, positive linear maps still lack a complete characterization. 
Consequently, given an entangled state p € S(S1 + S2) it is usually rather difficult 
to find a corresponding entanglement witness A. A relatively understood sub-class 
of positive maps is the following one. 




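Definition 6.2.1 (Decomposable Maps) A map $\Lambda : B(\mathbb{H}) \mapsto B(\mathbb{K})$ is decomposable if it is positive and $\Lambda = \Lambda_1 + \Lambda_2 \circ T_{\mathbb{H}}$, with $\Lambda_{1,2}$ CP maps and $T_{\mathbb{H}}$ the transposition on $B(\mathbb{H})$ with respect to a fixed orthonormal basis in $\mathbb{H}$.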


Let $(d_1, d_2) = (2, 2), (2, 3), (3, 2)$, then a theorem of Woronowicz [381] asserts that all positive maps $\Lambda : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$ are decomposable. This fact makes transposition an exhaustive entanglement witness in low dimension; in other words, for the stated dimensions, those states that remain positive under partial transposition are separable and vice versa.


Corollary 6.2.1 If in the previous proposition $(d_1, d_2) = (2, 2), (2, 3), (3, 2)$, then $\rho_{12} \in \mathcal{S}(S_1 + S_2)$ is entangled if and only if $T^{(2)}[\rho_{12}]$ is not positive-definite, where $T^{(2)} := \mathrm{id}_{d_1} \otimes T_{d_2}$ denotes partial transposition on the second factor.


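Proof If $\rho \in \mathcal{S}(S_1 + S_2)$ is separable then $T^{(2)}[\rho] \ge 0$ for transposition is a positive map. Vice versa, because of the assumption, Woronowicz theorem ensures that any positive map is decomposable. Therefore, if $T^{(2)}[\rho] \ge 0$, it turns out that, for all positive $\Lambda : M_{d_1}(\mathbb{C}) \mapsto M_{d_2}(\mathbb{C})$,

$$\mathrm{id}_{d_1} \otimes \Lambda[\rho] = \mathrm{id}_{d_1} \otimes \Lambda_1[\rho] + \mathrm{id}_{d_1} \otimes \Lambda_2\big[T^{(2)}[\rho]\big] \ge 0 ,$$

as $\Lambda_{1,2}$ are CP maps.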
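The criterion of Corollary 6.2.1 is straightforward to test numerically. The following is a minimal sketch, assuming NumPy and density matrices written in the product basis $|ij\rangle$ with the first index labelling $S_1$; the function names `partial_transpose` and `is_ppt` are hypothetical helpers, not notation from the text.

```python
import numpy as np

def partial_transpose(rho, d1, d2):
    """Transpose the second tensor factor of a (d1*d2) x (d1*d2) matrix."""
    r = np.asarray(rho).reshape(d1, d2, d1, d2)      # r[i, j, k, l] = <ij|rho|kl>
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

def is_ppt(rho, d1, d2, tol=1e-12):
    """True if the partially transposed matrix has no negative eigenvalue."""
    return np.linalg.eigvalsh(partial_transpose(rho, d1, d2)).min() >= -tol

# Singlet state of two qubits: entangled, hence NPT
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
singlet = np.outer(psi, psi.conj())

# A product state: separable, hence PPT
product = np.kron(np.diag([1.0, 0.0]), np.diag([0.5, 0.5]))

print(is_ppt(singlet, 2, 2), is_ppt(product, 2, 2))
```

For $(d_1, d_2) = (2,2), (2,3), (3,2)$ the output of `is_ppt` decides separability; in higher dimension a `True` answer is only a necessary condition.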


Remarks 6.2.1 


1. Though partial transposition, like transposition, has to be defined with respect to a chosen ONB, the spectrum of an operator is basis-independent; therefore, the non-positivity of $T^{(2)}[\rho_{12}]$, thence the entanglement of $\rho_{12}$, does not depend on the ONB with respect to which the partial transposition is performed.

2. Those states which remain positive under partial transposition are called PPT states, otherwise NPT states, namely negative under partial transposition. Woronowicz theorem does not extend to higher dimension; there are instances of non-decomposable positive maps already for $d_1 = d_2 = 3$ [102, 158, 340, 381]; as a consequence partial transposition is not an exhaustive entanglement witness in higher dimension. In other words, all NPT states are entangled, but there can exist PPT entangled states [180, 350].

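3. No pure bipartite state can be PPT entangled; indeed, by Proposition 5.5.8, entangled state vectors $|\Psi_{12}\rangle \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ have a Schmidt decomposition (5.169) of the form $|\Psi_{12}\rangle = \sum_{j=1}^{d} \lambda_j\, |\psi^{(1)}_j\rangle \otimes |\psi^{(2)}_j\rangle$, where $d := \min\{d_1, d_2\}$, $\{|\psi^{(1,2)}_j\rangle\}_{j=1}^{d}$ are orthonormal sets in the Hilbert spaces $\mathbb{C}^{d_{1,2}}$ and the Schmidt coefficients $\lambda_j > 0$ for at least two indices. Set $P_{12} := |\Psi_{12}\rangle\langle\Psi_{12}|$; the partial transposition with respect to the ONB having $\{|\psi^{(2)}_j\rangle\}_{j=1}^{d}$ among its elements yields

$$R_{12} := T^{(2)}[P_{12}] = \sum_{i,j=1}^{d} \lambda_i \lambda_j\, |\psi^{(1)}_i\rangle\langle\psi^{(1)}_j| \otimes |\psi^{(2)}_j\rangle\langle\psi^{(2)}_i| .$$

Let $\lambda_{1,2} > 0$, then

$$R_{12}\ \frac{|\psi^{(1)}_1 \psi^{(2)}_2\rangle - |\psi^{(1)}_2 \psi^{(2)}_1\rangle}{\sqrt{2}} = -\lambda_1 \lambda_2\ \frac{|\psi^{(1)}_1 \psi^{(2)}_2\rangle - |\psi^{(1)}_2 \psi^{(2)}_1\rangle}{\sqrt{2}} .$$

Thus $T^{(2)}[P_{12}]$ cannot be positive if $P_{12}$ is entangled.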

4. The entanglement of PPT entangled density matrices cannot be detected by 
decomposable positive maps as one can see from an argument similar to the one 
used in the proof of Corollary 6.2.1. An instance of such states will be discussed 
in Example 6.2.2. 


The following ones are families of bipartite states over $\mathbb{C}^d \otimes \mathbb{C}^d$, $d \ge 2$, where PPT states are always separable.


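Examples 6.2.1 1. Werner States [375] This is a class of $d^2 \times d^2$ density matrices on $\mathbb{C}^d \otimes \mathbb{C}^d$ of the form $\rho_W = \alpha\, 1_{d^2} + \beta\, V$ where $V$ is the flip operator (see (5.35)) and $W := \mathrm{Tr}(\rho_W V)$ (*).

As the eigenvalues of $V$ are $\pm 1$, those of $\rho_W$ are $\alpha \pm \beta$ and must be positive. Also, $V^2 = 1_{d^2}$ and $\mathrm{Tr}\,V = \sum_{i,j=1}^{d} \langle ij|V|ij\rangle = \sum_{i,j=1}^{d} |\langle i|j\rangle|^2 = d$; thus, normalization and (*) yield $\alpha d^2 + \beta d = 1$ and $W = \alpha d + \beta d^2$, whence

$$\alpha = \frac{d - W}{d(d^2 - 1)} ,\quad \beta = \frac{dW - 1}{d(d^2 - 1)} ,\quad \alpha + \beta = \frac{1 + W}{d(d + 1)} ,\quad \alpha - \beta = \frac{1 - W}{d(d - 1)} ,$$

$$\rho_W = \frac{d(d - W)}{d^2 - 1}\, \frac{1_{d^2}}{d^2} + \frac{dW - 1}{d(d^2 - 1)}\, V ,\qquad -1 \le W \le 1 . \qquad (6.2)$$

If $\rho_W$ is separable as in (5.176), by spectralizing the contributing density matrices, it can always be recast as $\rho_W = \sum_{ij} \mu_{ij}\, |\psi^{(1)}_i\rangle\langle\psi^{(1)}_i| \otimes |\psi^{(2)}_j\rangle\langle\psi^{(2)}_j|$, $\mu_{ij} \ge 0$, $\sum_{ij} \mu_{ij} = 1$. Then,

$$W = \mathrm{Tr}(\rho_W V) = \sum_{ij} \mu_{ij}\, \langle \psi^{(1)}_i \otimes \psi^{(2)}_j|V|\psi^{(1)}_i \otimes \psi^{(2)}_j\rangle = \sum_{ij} \mu_{ij}\, |\langle \psi^{(1)}_i|\psi^{(2)}_j\rangle|^2 \ge 0 .$$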
This is a necessary condition for the separability of Werner states in dimension $d$; it is also sufficient; the clue is that Werner states are exactly those states on $\mathbb{C}^{d^2}$ that commute with all unitaries of the form $U \otimes U$ with $U$ a unitary matrix in $M_d(\mathbb{C})$. Practically, since $V (A \otimes B) V = B \otimes A$ for all $A, B \in M_d(\mathbb{C})$, all Werner states have the form²

$$\rho_W = \int_{\mathcal{U}} dU\ (U \otimes U)\, \rho\, (U^\dagger \otimes U^\dagger) ,$$

where $dU$ is the normalized, invariant Haar measure over the unitary group $\mathcal{U}$ on $\mathbb{C}^d$; furthermore, $W = \mathrm{Tr}(\rho V)$.

If $1 \ge W \ge 0$, let $\psi \in \mathbb{C}^d$, $|\phi\rangle = \sqrt{W}\, |\psi\rangle + \sqrt{1 - W}\, |\psi^\perp\rangle \in \mathbb{C}^d$, with $\langle \psi|\psi^\perp\rangle = 0$, and set $\rho := |\phi\rangle\langle\phi| \otimes |\psi\rangle\langle\psi|$. Thus, $\rho_W$ arises by twirling a separable state with tensor products of local unitaries, therefore it is itself separable, as local actions cannot create entanglement.

The necessary and sufficient condition for separability, $W \ge 0$, turns out to be equivalent to $\rho_W$ being positive under partial transposition. Indeed, by applying $T^{(2)} = \mathrm{id}_d \otimes T_d$ to $\rho_W$ one gets

$$T^{(2)}[\rho_W] = \frac{d - W}{d(d^2 - 1)}\, 1_{d^2} + \frac{dW - 1}{d^2 - 1}\, P^d_+ .$$

Its eigenvalues $(d - W)/(d^3 - d) > 0$ and $W/d$ are positive if and only if $W \ge 0$.
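2. Isotropic States [182] This is a class of $d^2 \times d^2$ density matrices on $\mathbb{C}^d \otimes \mathbb{C}^d$ which are related to Werner states by partial transposition. They have the form $\rho_F = \alpha\, 1_{d^2} + \beta\, P^d_+$ and are uniquely identified by the parameter $0 \le F := \mathrm{Tr}(\rho_F P^d_+)$ (*).
Like for Werner states, positivity, normalization and (*) yield $\alpha \ge 0$, $\alpha d^2 + \beta = 1$ and $1 \ge F = \alpha + \beta \ge 0$, whence isotropic states are mixtures of the totally depolarized state on $\mathbb{C}^{d^2}$ and of the totally symmetric state,

$$\rho_F = \frac{d^2(1 - F)}{d^2 - 1}\, \frac{1_{d^2}}{d^2} + \frac{d^2 F - 1}{d^2 - 1}\, P^d_+ . \qquad (6.3)$$

Since $\langle \psi \otimes \phi|P^d_+|\psi \otimes \phi\rangle = \frac{1}{d}\, |\langle \psi|\phi^*\rangle|^2$, where $\phi^*$ is the vector in $\mathbb{C}^d$ with complex conjugate components with respect to $\phi$, if $\rho_F$ is separable, then (see the previous example)

$$F = \mathrm{Tr}(\rho_F P^d_+) = \frac{1}{d} \sum_{ij} \mu_{ij}\, |\langle \psi^{(1)}_i|(\psi^{(2)}_j)^*\rangle|^2 \le \frac{1}{d} .$$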


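2 The particular convex combination of states $(U \otimes U)\, \rho\, (U^\dagger \otimes U^\dagger)$ appearing in the integral is known as twirling. Twirled $\rho$ are such that, for all unitary $V$,

$$V \otimes V \Big( \int_{\mathcal{U}} dU\ U \otimes U\, \rho\, U^\dagger \otimes U^\dagger \Big) V^\dagger \otimes V^\dagger = \int_{\mathcal{U}} dU\ VU \otimes VU\, \rho\, (VU)^\dagger \otimes (VU)^\dagger$$

$$= \int_{\mathcal{U}} d(V^\dagger W)\ W \otimes W\, \rho\, W^\dagger \otimes W^\dagger = \int_{\mathcal{U}} dU\ U \otimes U\, \rho\, U^\dagger \otimes U^\dagger ,$$

for the Haar measure satisfies $d(VU) = dU$ for all unitary $V$.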
As for Werner states, $0 \le F \le 1/d$ is necessary and also sufficient for separability. The reason is that isotropic states are all and only those $d^2 \times d^2$ density matrices which commute with local unitaries of the form $U \otimes U^*$, where $U$ is any unitary matrix in $M_d(\mathbb{C})$ and $U^*$ denotes its complex conjugate (not its adjoint). Moreover, one can show that, since $(U \otimes U^*)\, P^d_+\, (U^\dagger \otimes U^T) = P^d_+$, any isotropic $\rho_F$ arises from a twirling of the form

$$\rho_F = \int_{\mathcal{U}} dU\ (U \otimes U^*)\, \rho\, (U^\dagger \otimes U^T) ,$$

where $U^T$ denotes the transposition of $U$ and $\rho$ is such that $\mathrm{Tr}(\rho P^d_+) = F$. If $Fd \le 1$, set $|\phi\rangle = \sqrt{dF}\, |\psi\rangle + \sqrt{1 - dF}\, |\psi^\perp\rangle$ and choose $\rho = |\psi\rangle\langle\psi| \otimes |\phi^*\rangle\langle\phi^*|$; then, $\rho_F$ can be obtained by twirling a separable state and is thus itself separable.

The above necessary and sufficient conditions for separability coincide with the isotropic states being positive under partial transposition. Indeed,

$$T^{(2)}[\rho_F] = \frac{1 - F}{d^2 - 1}\, 1_{d^2} + \frac{d^2 F - 1}{d(d^2 - 1)}\, V$$

has positive eigenvalues $(dF + 1)/(d^2 + d) > 0$ and $(1 - dF)/(d^2 - d)$ if and only if $0 \le F \le 1/d$.
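The two families and their PPT thresholds can be checked numerically. The following is a minimal sketch for $d = 3$, assuming NumPy, the flip operator $V|ij\rangle = |ji\rangle$ and $P^d_+$ the projector onto $d^{-1/2}\sum_i |ii\rangle$; all function names are hypothetical helpers.

```python
import numpy as np

def flip(d):
    """Flip operator V on C^d (x) C^d: V|j i> = |i j>."""
    V = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            V[i * d + j, j * d + i] = 1.0
    return V

def sym_projector(d):
    """Projector onto the symmetric vector (1/sqrt(d)) sum_i |ii>."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)

def werner(W, d):
    """Werner state (6.2) with Tr(rho V) = W."""
    a = (d - W) / (d * (d**2 - 1))
    b = (d * W - 1) / (d * (d**2 - 1))
    return a * np.eye(d * d) + b * flip(d)

def isotropic(F, d):
    """Isotropic state (6.3) with Tr(rho P_+) = F."""
    a = (1 - F) / (d**2 - 1)
    b = (d**2 * F - 1) / (d**2 - 1)
    return a * np.eye(d * d) + b * sym_projector(d)

def min_pt_eigenvalue(rho, d):
    """Smallest eigenvalue of the partial transpose on the second factor."""
    pt = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(pt).min()

d = 3
for W in (-0.5, 0.0, 0.5):
    print("Werner W =", W, "min PT eigenvalue =", min_pt_eigenvalue(werner(W, d), d))
for F in (0.2, 1 / 3, 0.5):
    print("isotropic F =", F, "min PT eigenvalue =", min_pt_eigenvalue(isotropic(F, d), d))
```

The sign of the printed eigenvalue changes at $W = 0$ and at $F = 1/d$, in line with the eigenvalues $W/d$ and $(1 - dF)/(d^2 - d)$ computed above.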


6.2.1 Distillability and Bound Entanglement 


Entangled states of two qubits as the Bell states (see Example 5.5.9) are called maximally entangled. Consider a pure state $|\Psi_{12}\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$ of a bipartite system consisting of two copies of a same system; as we shall see, there are good reasons to measure the amount of entanglement of $|\Psi_{12}\rangle\langle\Psi_{12}|$ by means of the von Neumann entropy of any of its two marginal density matrices $\rho^{(1,2)}_{\Psi_{12}} := \mathrm{Tr}_{2,1}(|\Psi_{12}\rangle\langle\Psi_{12}|)$ (see Proposition 5.5.6).


Definition 6.2.2 (Pure State Entanglement) Let $|\Psi_{12}\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$ be a pure state of the bipartite system $S_1 + S_2$; the entanglement of $|\Psi_{12}\rangle$ is

$$E_F[\,|\Psi_{12}\rangle\langle\Psi_{12}|\,] = S\big(\rho^{(1)}_{\Psi_{12}}\big) . \qquad (6.4)$$


According to Example 5.5.10.1, all Bell states have marginal states that are the tracial state with maximal von Neumann entropy, so that their entanglement equals $\log 2$.
The presence of uncontrollable interactions with the environment in which a bipartite system may be immersed usually spoils its maximally entangled states. For instance, the so-called singlet state $|\Psi_-\rangle := \frac{|01\rangle - |10\rangle}{\sqrt{2}}$ might be rotated into $|\Psi\rangle = \alpha\, |01\rangle + \beta\, |10\rangle$, with $|\alpha|^2 + |\beta|^2 = 1$ and $0 < |\alpha| < 1$, so that

$$E_F[\Psi] = -|\alpha|^2 \log |\alpha|^2 - |\beta|^2 \log |\beta|^2 \le \log 2 .$$

It may even be turned into a mixed state because the environment usually acts as a source of noise and dissipation. A measure of the entanglement of $\rho_{12} \in \mathcal{S}(S_1 + S_2)$ is as follows.³


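Definition 6.2.3 (Entanglement of Formation [69]) The entanglement of formation of a state $\rho_{12}$ of a bipartite system $S_1 + S_2$ is the least average pure state entanglement over all convex decompositions of $\rho_{12}$,

$$E_F[\rho_{12}] = \min\Big\{ \sum_j \lambda_j\, S\big(\rho^{(1)}_{\psi^{12}_j}\big)\ :\ \rho_{12} = \sum_j \lambda_j\, |\psi^{12}_j\rangle\langle\psi^{12}_j| \Big\} , \qquad (6.5)$$

where $\lambda_j \ge 0$ and $\sum_j \lambda_j = 1$.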
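The pure state entanglement (6.4) entering the minimization (6.5) is just the entropy of the squared Schmidt coefficients. The following is a minimal sketch, assuming NumPy and a state vector given by its components in the product basis; the function name `pure_state_entanglement` and the choice of `base` are assumptions made for illustration (base 2 gives one unit of entanglement for the singlet).

```python
import numpy as np

def pure_state_entanglement(psi, d1, d2, base=2.0):
    """Von Neumann entropy of a marginal of |psi> in C^d1 (x) C^d2, cf. (6.4)."""
    c = np.asarray(psi, dtype=complex).reshape(d1, d2)   # coefficient matrix
    c = c / np.linalg.norm(c)
    p = np.linalg.svd(c, compute_uv=False) ** 2           # squared Schmidt coefficients
    p = p[p > 1e-15]                                       # convention 0 log 0 = 0
    return float(-(p * np.log(p)).sum() / np.log(base))

singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
rotated = np.array([0.0, np.sqrt(0.9), np.sqrt(0.1), 0.0])   # alpha|01> + beta|10>
product = np.kron([1.0, 0.0], [0.6, 0.8])

print(pure_state_entanglement(singlet, 2, 2))   # maximal: one bit
print(pure_state_entanglement(rotated, 2, 2))   # strictly less than one bit
print(pure_state_entanglement(product, 2, 2))   # no entanglement
```

Computing $E_F$ for a mixed state additionally requires the minimization over all convex decompositions, which this sketch does not attempt.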


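Maximal entanglement is an important resource in quantum information, but also a highly degradable one; of particular importance are then those quantum protocols that make it possible to distil maximally entangled states out of non-maximally entangled ones by means of LOCC.⁴ The basic scheme of a distillation protocol is as follows: given $m$ copies of $\rho_{12} \in \mathcal{S}(\mathbb{C}^d \otimes \mathbb{C}^d)$, one tries to maximize the number $n$ of copies of the singlet state projection $P_- := |\Psi_-\rangle\langle\Psi_-|$ that can be obtained by means of local operations and classical communication:

$$\underbrace{\rho_{12} \otimes \rho_{12} \otimes \cdots \otimes \rho_{12}}_{m\ \text{times}}\ \xrightarrow{\ LOCC\ }\ \underbrace{P_- \otimes P_- \otimes \cdots \otimes P_-}_{n\ \text{times}} .$$

The LOCC defining the distillation protocols amount to maps of the form

$$\rho_{12}^{\otimes m} \mapsto \rho'_{12} = \frac{1}{N_I} \sum_{i \in I} A_{1i} \otimes A_{2i}\ \rho_{12}^{\otimes m}\ A_{1i}^\dagger \otimes A_{2i}^\dagger , \qquad (6.6)$$

where

$$N_I := \mathrm{Tr}\Big( \sum_{i \in I} A_{1i}^\dagger A_{1i} \otimes A_{2i}^\dagger A_{2i}\ \rho_{12}^{\otimes m} \Big) ,$$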


3 Various entanglement measures have appeared while quantum entanglement theory has been 
developing, for a review and the related literature see the contribution by M. B. Plenio and S. S. 
Virmani in [90]. 

4 For a review of entanglement distillation and the other topics of this Section see [89] and the contributions by A. Sen, U. Sen, M. Lewenstein et al., W. Dür and H.-J. Briegel, and P. Horodecki in [90].




while $A_{ji} : (\mathbb{C}^d)^{\otimes m} \mapsto (\mathbb{C}^2)^{\otimes n}$, $j = 1, 2$.

Practically, one seeks distillation protocols that output states $\rho'_{12}$ whose distance from $P_-^{\otimes n}$ (for instance, with respect to the trace-norm (5.21)) vanishes when $m \to +\infty$, while the ratio $n/m$ is the highest possible. The optimal ratio, denoted by $E_D[\rho_{12}]$, is called entanglement of distillation and represents the maximal asymptotic fraction of singlets per $\rho_{12}$ that one can achieve by LOCC. In other words, one can hope to distil at most $n \simeq m\, E_D[\rho_{12}]$ singlets $P_-$ out of $m$ copies of $\rho_{12}$ when $m$ gets sufficiently large.

It turns out that PPT states $\rho_{12}$ cannot be distilled [181]; when $\rho_{12}$ is separable this is obvious since one cannot create non-local quantum correlations by means of local operations and classical communication. The interesting point is that one has to distinguish between free entanglement, the entanglement which can be distilled, and bound entanglement, that which cannot be distilled. The result just quoted can be rephrased by saying that the entanglement of PPT entangled states is bound. This can be seen as follows: if the entanglement in $\rho_{12}$ is distillable, then for some $m$, the state $\rho'_{12}$ in (6.6) must be an entangled state of 2 qubits, whence an NPT state according to Corollary 6.2.1. This implies that, for at least one index $i_0 \in I$, the (non-normalized) state

$$\tilde\rho_{12} := A_{1i_0} \otimes A_{2i_0}\ \rho_{12}^{\otimes m}\ A_{1i_0}^\dagger \otimes A_{2i_0}^\dagger \qquad (6.7)$$

is NPT. Observe that, for such $m$ and $i_0$, $A_{ji_0} : (\mathbb{C}^d)^{\otimes m} \mapsto \mathbb{C}^2$; therefore, one can always write $A_{ji_0} = \sum_{k=0}^{1} |k\rangle\langle\psi_{jk}|$, where $|\psi_{jk}\rangle \in (\mathbb{C}^d)^{\otimes m}$ and $|k\rangle$, $k = 0, 1$, is any chosen basis in $\mathbb{C}^2$. Let $Q_j$ be the projections onto the subspaces of $(\mathbb{C}^d)^{\otimes m}$ spanned by $|\psi_{j0}\rangle$ and $|\psi_{j1}\rangle$; then,

$$\tilde\rho_{12} = A_{1i_0} \otimes A_{2i_0}\ Q_1 \otimes Q_2\ \rho_{12}^{\otimes m}\ Q_1 \otimes Q_2\ A_{1i_0}^\dagger \otimes A_{2i_0}^\dagger$$

implies that $\rho^{Q}_{12} := Q_1 \otimes Q_2\ \rho_{12}^{\otimes m}\ Q_1 \otimes Q_2$ must be entangled, otherwise its separability would be preserved when passing to $\tilde\rho_{12}$.

Consider now the orthonormal bases $\{|b_{jk}\rangle\}_{k=1}^{d^m}$ in $(\mathbb{C}^d)^{\otimes m}$, $j = 1, 2$, such that $Q_j = |b_{j1}\rangle\langle b_{j1}| + |b_{j2}\rangle\langle b_{j2}|$; in the corresponding representation $\rho^{Q}_{12}$ is a $4 \times 4$ matrix acting on the subspace $\mathbb{K}$ spanned by the product states $|b_{1i} b_{2j}\rangle$, $i, j = 1, 2$. Since it corresponds to an entangled state, by partial transposition with respect to the ONB $\{|b_{2k}\rangle\}_{k=1}^{2}$, $T^{(2)}[\rho^{Q}_{12}]$ cannot be positive semi-definite. Therefore, there must exist $|\Phi\rangle \in \mathbb{K}$ such that

$$\langle\Phi|T^{(2)}[\rho^{Q}_{12}]|\Phi\rangle = \sum_{i,j;k,\ell=1}^{2} \Phi^*_{ij} \Phi_{k\ell}\, \langle b_{1i} b_{2j}|T^{(2)}[\rho^{Q}_{12}]|b_{1k} b_{2\ell}\rangle = \sum_{i,j;k,\ell=1}^{2} \Phi^*_{ij} \Phi_{k\ell}\, \langle b_{1i} b_{2\ell}|\rho^{Q}_{12}|b_{1k} b_{2j}\rangle$$

$$= \sum_{i,j;k,\ell=1}^{2} \Phi^*_{ij} \Phi_{k\ell}\, \langle b_{1i} b_{2\ell}|\rho_{12}^{\otimes m}|b_{1k} b_{2j}\rangle = \langle\Phi|T^{(2)}[\rho_{12}^{\otimes m}]|\Phi\rangle < 0 ,$$

where $T^{(2)}[\rho_{12}^{\otimes m}]$ is now the partial transposition with respect to the whole ONB $\{|b_{2k}\rangle\}_{k=1}^{d^m}$. Also, the last equality follows because $|\Phi\rangle$ is supported by the subspace $\mathbb{K}$ corresponding to the orthogonal projection $Q_1 \otimes Q_2$ and thus has vanishing projections onto all $|b_{1i} b_{2j}\rangle$ unless $i, j = 1, 2$.

Since NPT is a property which does not depend on the basis chosen to compute the partial transposition (see Remark 6.2.1.1), fix the bases $\{|e_{jk}\rangle\}_{k=1}^{d}$ in $\mathbb{C}^d$ and choose in $(\mathbb{C}^d \otimes \mathbb{C}^d)^{\otimes m}$ the product basis consisting of vectors $|e_{1k_1} e_{1k_2} \cdots e_{1k_m}\rangle \otimes |e_{2\ell_1} e_{2\ell_2} \cdots e_{2\ell_m}\rangle$. Then, $T^{(2)}[\rho_{12}^{\otimes m}] = (T^{(2)}[\rho_{12}])^{\otimes m}$; one thus concludes that $\rho_{12}$ is distillable only if $\rho_{12}$ is NPT.


Remark 6.2.2 From Remark 6.2.1.3 we know that no pure PPT entangled state can exist; it turns out that the entanglement of pure entangled states is always distillable and thus free. Whether the entanglement of generic NPT states is also free, that is whether all NPT states are distillable, is one of the open problems in quantum information theory [90, 183].


6.2.2 Entanglement Cost 


One of the first questions in quantum information has been whether, by means of LOCC, one can turn a pure state $|\Psi_{12}\rangle$ of a bipartite system into another pure state $|\Phi_{12}\rangle$. The answer is that this is possible if and only if the marginal states $\rho^{(1,2)}_{\Psi_{12}}$ are more mixed than those of $|\Phi_{12}\rangle$ in the sense of Definition 5.5.3 [266].

If one considers asymptotic LOCC protocols where $m$ copies of a state $|\Psi_{12}\rangle$ are turned into $n$ copies of a state $|\Phi_{12}\rangle$ with vanishing error when $m \to +\infty$, then the transformation of $|\Psi_{12}\rangle$ into $|\Phi_{12}\rangle$ is possible if and only if [90]

$$\frac{n}{m} \le \frac{E_F[\Psi_{12}]}{E_F[\Phi_{12}]} .$$

Since $E_F[\Psi_-] = 1$ (we shall use $\log_2$ in the following), one can always asymptotically distil $n \le m\, E_F[\Psi_{12}]$ copies of $P_-$ out of $m$ copies of any pure bipartite entangled state $\Psi_{12}$.

Furthermore, the reverse operation is also possible; namely, protocols have been devised which invert distillation and, by using $m$ copies of the singlet state $P_-$, form, by means of LOCC, $n$ copies of a bipartite pure state $\Psi_{12}$. Actually, like in the case of entanglement distillation, one considers the asymptotic minimal ratio $m/n$ when $n \to +\infty$ and $\rho_{12}^{\otimes n}$ is better and better approximated (within a suitable distance) by a suitable LOCC operation acting on $P_-^{\otimes m}$ [165]. The optimal asymptotic ratio, denoted by $E_C[\rho_{12}]$, is called the entanglement cost of $\rho_{12}$; it represents the minimal fraction of singlets that is needed to create one bipartite system in the state $\rho_{12}$. In other words, for large $n$, one can create $n$ copies of $\rho_{12}$ only acting with LOCC on no less than $n\, E_C[\rho_{12}]$ singlets.

In [266] a distillation protocol $\Lambda_D$ is constructed which asymptotically yields $E_F[\Psi_{12}] = S\big(\rho^{(1)}_{\Psi_{12}}\big)$ singlets per bipartite entangled pure state $\Psi_{12}$ (see (6.4)) and a formation protocol $\Lambda_F$ that asymptotically yields one copy of $\Psi_{12}$ at the cost of $E_F[\Psi_{12}]$ singlets. It turns out that the entanglement cost and the entanglement of distillation equal the pure state entanglement of formation.

By definition, $E_C[\Psi_{12}] \le E_F[\Psi_{12}] \le E_D[\Psi_{12}]$; if $E_F[\Psi_{12}] < E_D[\Psi_{12}]$, then, by means of the protocol $\Lambda_F$, one could asymptotically obtain $\frac{m}{E_F[\Psi_{12}]}$ copies of $\Psi_{12}$ out of $m$ copies of $P_-$ and then, using an optimal distillation protocol, extract from them $m\, \frac{E_D[\Psi_{12}]}{E_F[\Psi_{12}]} > m$ copies of $P_-$. This is impossible as one cannot increase the amount of entanglement by deterministic LOCC.⁵ Analogously, if $E_C[\Psi_{12}] < E_F[\Psi_{12}]$, then one could use an optimal creation protocol to obtain $\frac{m}{E_C[\Psi_{12}]}$ copies of $\Psi_{12}$ out of $m$ copies of $P_-$ (for $m$ large) and then use the distillation protocol $\Lambda_D$ to extract from them $m\, \frac{E_F[\Psi_{12}]}{E_C[\Psi_{12}]} > m$ copies of $P_-$.


For pure states, forming entangled states from singlets and distilling singlets from 
entangled states are reversible operations; it is not so for mixed states and the reason 
for this peculiar kind of irreversibility is bound entanglement [90, 181,183]. 

Consider the entanglement of formation as defined by (6.5); it can be interpreted as the minimal averaged entanglement cost of $\rho_{12}$. In fact, given a convex decomposition $\rho_{12} = \sum_j \lambda_j\, |\psi^{12}_j\rangle\langle\psi^{12}_j|$, the entropies $S\big(\rho^{(1)}_{\psi^{12}_j}\big)$ are the entanglement costs of the pure states that decompose it. However, in line of principle, it could be more advantageous to create the tensor product $\rho_{12}^{\otimes n}$ instead of the $n$ copies of $\rho_{12}$ one by one. One is thus led to define the so-called regularized entanglement of formation

$$E_F^{\infty}[\rho_{12}] := \lim_{n \to \infty} \frac{1}{n}\, E_F[\rho_{12}^{\otimes n}] . \qquad (6.8)$$

Such a limit exists because the entanglement of formation is subadditive. Indeed, consider the state $\rho^{(1)}_{12} \otimes \rho^{(2)}_{12}$ of two copies of the bipartite system $S_1 + S_2$ and suppose $E_F[\rho^{(i)}_{12}]$ is achieved at the (optimal) decompositions $\rho^{(i)}_{12} = \sum_j \lambda^{(i)}_j\, |\phi^{(i)}_j\rangle\langle\phi^{(i)}_j|$. Since the decomposition

$$\rho^{(1)}_{12} \otimes \rho^{(2)}_{12} = \sum_{j,k} \lambda^{(1)}_j \lambda^{(2)}_k\ |\phi^{(1)}_j\rangle\langle\phi^{(1)}_j| \otimes |\phi^{(2)}_k\rangle\langle\phi^{(2)}_k|$$

need not in general be optimal for $E_F[\rho^{(1)}_{12} \otimes \rho^{(2)}_{12}]$, and the marginal entropies of product vectors add up, it follows that

$$E_F[\rho^{(1)}_{12} \otimes \rho^{(2)}_{12}] \le E_F[\rho^{(1)}_{12}] + E_F[\rho^{(2)}_{12}] .$$

In [165] it is proved that the regularized entanglement of formation equals the entanglement cost: $E_C[\rho_{12}] = E_F^{\infty}[\rho_{12}]$. Moreover (see P. Horodecki's contribution in [90]), it has been proved that $E_C[\rho_{12}] > 0$ for all entangled $\rho_{12}$.


5 One can achieve entanglement increase by LOCC only probabilistically for certain states of a 
mixture, for instance in some of the states in (6.6), but not on the average for the whole mixture. 
As a consequence of the fact that, if $\rho_{12}$ is PPT entangled, no entanglement can be distilled from it, it thus turns out that a non-zero, non-retrievable amount of entanglement (of singlets) is always necessary to create PPT entangled states.


6.2.3 Concurrence 


We shall now elaborate a little bit more in detail on the entanglement of formation 
of two qubit states. 


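Let $S_{A,B}$ be two qubits, $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ the standard basis in $\mathbb{C}^2$ and consider a generic two qubit state vector of $S_A + S_B$ of the form

$$|\Psi_{AB}\rangle = C_{00}\, |00\rangle + C_{01}\, |01\rangle + C_{10}\, |10\rangle + C_{11}\, |11\rangle .$$

Then, the marginal state $\rho_A = C C^\dagger$, $C = \begin{pmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{pmatrix}$, has eigenvalues

$$\frac{1 \pm \sqrt{1 - C(\Psi_{AB})^2}}{2} ,\qquad C(\Psi_{AB}) := 2\, |C_{00} C_{11} - C_{01} C_{10}| . \qquad (6.9)$$

This expression can be recast as follows. Let $|\Psi^*_{AB}\rangle$ denote the complex conjugate of $|\Psi_{AB}\rangle$ with respect to the standard product basis $\{|ij\rangle\}_{i,j=0,1}$ and denote

$$|\tilde\Psi_{AB}\rangle := \sigma_2 \otimes \sigma_2\, |\Psi^*_{AB}\rangle = -C^*_{00}\, |11\rangle + C^*_{01}\, |10\rangle + C^*_{10}\, |01\rangle - C^*_{11}\, |00\rangle , \qquad (6.10)$$

for $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ is such that $\sigma_2\, |0\rangle = i\, |1\rangle$, $\sigma_2\, |1\rangle = -i\, |0\rangle$. Then,

$$|\langle \tilde\Psi_{AB}|\Psi_{AB}\rangle| = C(\Psi_{AB}) . \qquad (6.11)$$

Since $\sigma_2 \begin{pmatrix} \alpha^* \\ \beta^* \end{pmatrix} \perp \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$, it turns out that $C(\Psi_{AB}) = 0$ when $\Psi_{AB}$ is separable, while $C(\Psi_{AB})$ reaches its maximum $C(\Psi_{AB}) = 1$ when $\Psi_{AB}$ is maximally entangled and (only) two coefficients $C_{ij}$ are proportional to $2^{-1/2}$.

Therefore, for two qubit state vectors, the entanglement of formation reads

$$E_F[\,|\Psi_{AB}\rangle\langle\Psi_{AB}|\,] = E(\Psi_{AB}) := H_2\Big( \frac{1 + \sqrt{1 - C(\Psi_{AB})^2}}{2} \Big) , \qquad (6.12)$$

where $H_2(x) := -x \log x - (1 - x) \log(1 - x)$.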
The variational problem embodied in Definition 6.2.3 is in general extremely difficult to solve and a general closed expression of $E_F[\rho]$ as a function of $\rho$ has been found only in the case of two qubits; it is based upon the notion of concurrence [380].

Given a two qubit density matrix $\rho \in M_4(\mathbb{C})$, one first constructs the density matrix

$$\tilde\rho = \sigma_2 \otimes \sigma_2\ \rho^*\ \sigma_2 \otimes \sigma_2 , \qquad (6.13)$$

obtained via the operation (6.10), where $\rho^*$ denotes complex conjugation with respect to the standard basis $\{|ij\rangle\}_{i,j=0,1}$. Then, the quantity $C(\Psi_{AB})$ in (6.9) generalizes to density matrices as follows.


Definition 6.2.4 (Concurrence) Let $\lambda_i$, $i = 1, 2, 3, 4$, be the positive eigenvalues of $\sqrt{\sqrt{\rho}\, \tilde\rho\, \sqrt{\rho}}$ in decreasing order. The concurrence of $\rho$ is

$$C(\rho) := \max\{\lambda_1 - \lambda_2 - \lambda_3 - \lambda_4,\ 0\} . \qquad (6.14)$$
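Definition 6.2.4 can be turned into a few lines of code. The following is a minimal sketch, assuming NumPy and a $4 \times 4$ density matrix in the standard product basis; it uses the fact that the matrices $\sqrt{\rho}\,\tilde\rho\,\sqrt{\rho}$ and $\rho\,\tilde\rho$ have the same eigenvalues (they are of the form $AB$ and $BA$), so that no matrix square root is needed. The function name `concurrence` is a hypothetical helper.

```python
import numpy as np

SIGMA_2 = np.array([[0, -1j], [1j, 0]])
YY = np.kron(SIGMA_2, SIGMA_2)

def concurrence(rho):
    """Concurrence (6.14) of a two qubit density matrix."""
    rho = np.asarray(rho, dtype=complex)
    rho_tilde = YY @ rho.conj() @ YY                    # operation (6.13)
    ev = np.linalg.eigvals(rho @ rho_tilde)             # same spectrum as sqrt(rho) rho_tilde sqrt(rho)
    lam = np.sort(np.sqrt(np.clip(ev.real, 0.0, None)))[::-1]
    return float(max(lam[0] - lam[1] - lam[2] - lam[3], 0.0))

# Pure states: singlet (maximally entangled) and a product vector
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
singlet = np.outer(psi, psi.conj())
phi = np.kron([1.0, 0.0], [0.6, 0.8])
product = np.outer(phi, phi.conj())

# Two qubit Werner state (6.2) with d = 2 and W = Tr(rho V) = -0.6
V = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
W = -0.6
rho_w = (2 - W) / 6 * np.eye(4) + (2 * W - 1) / 6 * V

print(concurrence(singlet), concurrence(product), concurrence(rho_w))
```

For the Werner state the value can be compared with the eigenvalues $(1 + W)/6$ (thrice degenerate) and $(1 - W)/2$ of $\rho_W$, which give $\max\{-W, 0\}$ when $W \le 1/2$.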
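Examples 6.2.2 1. Pure states Let $\rho = |\psi\rangle\langle\psi|$, $|\psi\rangle \in \mathbb{C}^4$. Then,

$$\sqrt{\rho}\, \tilde\rho\, \sqrt{\rho} = |\langle\psi|\tilde\psi\rangle|^2\ |\psi\rangle\langle\psi| ,$$

whence $C(\rho) = |\langle\psi|\tilde\psi\rangle|$.
2. Werner states Setting $d = 2$ in (6.2), from

$$|0\rangle\langle 0| = \frac{1 + \sigma_3}{2} ,\quad |0\rangle\langle 1| = \frac{\sigma_1 + i \sigma_2}{2} ,\quad |1\rangle\langle 1| = \frac{1 - \sigma_3}{2} ,\quad |1\rangle\langle 0| = \frac{\sigma_1 - i \sigma_2}{2} ,$$

it follows that

$$P^2_+ = \frac{1}{4} \big( 1 \otimes 1 + \sigma_1 \otimes \sigma_1 - \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3 \big)$$

$$V = 2\ \mathrm{id} \otimes T[P^2_+] = \frac{1}{2} \big( 1 \otimes 1 + \sigma_1 \otimes \sigma_1 + \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3 \big)$$

$$\rho_W = \frac{1}{4} \Big( 1 \otimes 1 + \frac{2W - 1}{3} \big( \sigma_1 \otimes \sigma_1 + \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3 \big) \Big) .$$

Since $W \in \mathbb{R}$, the algebraic relations among the Pauli matrices yield $\tilde\rho_W = \rho_W$, whence the eigenvalues of $\sqrt{\sqrt{\rho_W}\, \tilde\rho_W\, \sqrt{\rho_W}}$ are those of $\rho_W$, namely $\frac{1 + W}{6}$ (thrice degenerate) and $\frac{1 - W}{2}$. It then follows that

$$C(\rho_W) = \begin{cases} \max\{-W,\ 0\} & -1 \le W \le 1/2 \\ \max\{(W - 2)/3,\ 0\} & 1/2 \le W \le 1 \end{cases} ,$$

whence $C(\rho_W) > 0$ and $\rho_W$ is entangled if and only if $W < 0$, in agreement with Example 6.2.1.1.
3. Isotropic States Setting $d = 2$ in (6.3) and arguing as in the previous example, the isotropic states read

$$\rho_F = \frac{1}{4} \Big( 1 \otimes 1 + \frac{4F - 1}{3} \big( \sigma_1 \otimes \sigma_1 - \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3 \big) \Big) .$$

Again, it turns out that $\tilde\rho_F = \rho_F$ so that the eigenvalues of $\sqrt{\sqrt{\rho_F}\, \tilde\rho_F\, \sqrt{\rho_F}}$ are those of $\rho_F$ itself, namely $\frac{1 - F}{3}$, thrice degenerate, and $F$. Thus,

$$C(\rho_F) = \begin{cases} \max\{2F - 1,\ 0\} & 1/4 \le F \le 1 \\ \max\{-(1 + 2F)/3,\ 0\} & 0 \le F \le 1/4 \end{cases} ,$$

whence $\rho_F$ is entangled if and only if $F > 1/2$, in agreement with Example 6.2.1.2.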


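By direct inspection, the function (see (6.12))

$$\mathcal{E}(C(\psi)) = H_2\Big( \frac{1 + \sqrt{1 - C(\psi)^2}}{2} \Big) \qquad (6.15)$$

is monotonically increasing ($\mathcal{E}'(x) > 0$, $0 < x < 1$) and convex ($\mathcal{E}''(x) > 0$, $0 < x < 1$) in the concurrence. As $0 \le C(\psi) \le 1$, the entanglement increases from $\mathcal{E}(C(\psi)) = 0$ for separable state vectors to $\mathcal{E}(C(\psi)) = 1$ for maximally entangled states and

$$\mathcal{E}(\lambda x_1 + (1 - \lambda) x_2) \le \lambda\, \mathcal{E}(x_1) + (1 - \lambda)\, \mathcal{E}(x_2) ,\qquad 0 \le \lambda \le 1 ,\ 0 \le x_{1,2} \le 1 .$$

Further, given a decomposition $\rho = \sum_j p_j\, |\psi_j\rangle\langle\psi_j|$, let

$$\langle C \rangle_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} := \sum_j p_j\, C(\psi_j) \qquad (6.16)$$

$$\langle \mathcal{E} \rangle_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} := \sum_j p_j\, \mathcal{E}(C(\psi_j)) \qquad (6.17)$$

denote the corresponding average concurrence, respectively the average entanglement. Because of convexity, it turns out that

$$\mathcal{E}\big( \langle C \rangle_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} \big) \le \langle \mathcal{E} \rangle_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} . \qquad (6.18)$$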
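Monotonicity, convexity and the inequality (6.18) can be illustrated numerically. The following is a minimal sketch, assuming NumPy and base 2 logarithms; the function names `binary_entropy` and `entanglement_from_concurrence`, the grid and the sample weights are assumptions made for illustration, and finite differences on a grid are of course no substitute for the analytic statement.

```python
import numpy as np

def binary_entropy(x):
    """H_2(x) = -x log2 x - (1 - x) log2 (1 - x), with 0 log 0 := 0."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    m = (x > 0) & (x < 1)
    out[m] = -x[m] * np.log2(x[m]) - (1 - x[m]) * np.log2(1 - x[m])
    return out

def entanglement_from_concurrence(c):
    """The function (6.15) of the concurrence c in [0, 1]."""
    c = np.atleast_1d(np.asarray(c, dtype=float))
    return binary_entropy((1 + np.sqrt(1 - c**2)) / 2)

grid = np.linspace(0.0, 1.0, 201)
vals = entanglement_from_concurrence(grid)

increasing = bool(np.all(np.diff(vals) > 0))          # first differences positive
convex = bool(np.all(np.diff(vals, 2) > -1e-12))      # second differences non-negative

# Inequality (6.18) for a sample ensemble of pure state concurrences
p = np.array([0.2, 0.5, 0.3])
c = np.array([0.1, 0.9, 0.4])
lhs = entanglement_from_concurrence(np.dot(p, c))[0]
rhs = float(np.dot(p, entanglement_from_concurrence(c)))
print(increasing, convex, lhs <= rhs)
```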


Theorem 6.2.1 ([380]) The entanglement of formation (Definition 6.2.3) of any state $\rho$ of a two qubit system is given by $E_F[\rho] = \mathcal{E}(C(\rho))$ and is thus a monotonically increasing function of the concurrence (6.14).
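Proof The right hand side of (6.18) is the argument of the minimum in (6.5) (see (6.12)), whereas the left hand side is an increasing function of its argument. It thus follows that $E_F[\rho]$ cannot be smaller than $\mathcal{E}(C_{min})$ where $C_{min}$ is the smallest average concurrence. Therefore, if $C_{min}$ is attained at a suitable decomposition, then the same decomposition yields $E_F[\rho] = \mathcal{E}(C_{min})$. We will construct a decomposition $\rho = \sum_j p_j\, |\psi_j\rangle\langle\psi_j|$ such that $\langle C \rangle_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} = C(\rho)$ and show that no smaller average concurrence can be achieved, namely $C_{min} = C(\rho)$.

In order to arrive at such decomposition, we first consider the expansion $\rho = \sum_{i=1}^{n} |v_i\rangle\langle v_i|$, $n \le 4$ being the rank of $\rho$, and $|v_i\rangle$ its (non-normalized) eigenvectors such that $\langle v_i|v_j\rangle = r_i \delta_{ij}$, with $r_i$ the eigenvalues of $\rho$.

The $n \times n$ matrix $\tau$ with entries $\tau_{ij} := \langle v_i|\tilde v_j\rangle$, where $|\tilde v_j\rangle := \sigma_2 \otimes \sigma_2\, |v_j^*\rangle$, is symmetric, $\langle v_i|\tilde v_j\rangle = \langle v_j|\tilde v_i\rangle$, but not hermitian and

$$(\tau \tau^*)_{ij} = (\tau \tau^\dagger)_{ij} = \sum_{k=1}^{n} \langle v_i|\tilde v_k\rangle \langle \tilde v_k|v_j\rangle = \langle r_i|\sqrt{\rho}\, \tilde\rho\, \sqrt{\rho}\,|r_j\rangle ,$$

where $|r_i\rangle := |v_i\rangle / \sqrt{r_i}$ are the normalized eigenvectors of $\rho$. Thus the eigenvalues of $\tau \tau^*$ are the squares of the eigenvalues $\lambda_i$ of $\sqrt{\sqrt{\rho}\, \tilde\rho\, \sqrt{\rho}}$ in decreasing order (see (6.14)). Let $Z$ be the $n \times n$ unitary matrix that diagonalizes $\tau \tau^*$,

$$Z \tau \tau^* Z^\dagger = (Z \tau Z^T)(Z \tau Z^T)^* = \mathrm{diag}(\lambda_1^2, \lambda_2^2, \lambda_3^2, \lambda_4^2) ,$$

then $Z$ can be chosen such that $Z \tau Z^T = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ is diagonal with the $\lambda_j$'s as eigenvalues. Setting $|w_i\rangle := \sum_{j=1}^{n} Z^*_{ij}\, |v_j\rangle$ gives $\rho = \sum_{i=1}^{n} |w_i\rangle\langle w_i|$, with decomposers such that

$$\langle w_i|\tilde w_j\rangle = \sum_{k,\ell=1}^{n} Z_{ik} Z_{j\ell}\, \langle v_k|\tilde v_\ell\rangle = (Z \tau Z^T)_{ij} = \lambda_i \delta_{ij} .$$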


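Case 1: $\lambda_1 < \lambda_2 + \lambda_3 + \lambda_4$.

Because of the ordering of the $\lambda_j$'s, this case is possible if $n \ge 3$. Consider the quantity $f(\theta) := \sum_{j=1}^{4} \lambda_j\, e^{2i\theta_j}$: since $f(0, \pi/2, \pi/2, \pi/2) < 0$ while $f(0, 0, 0, 0) > 0$, by continuity $f(\theta) = 0$ at some $\theta$. Using the vectors $|w_j\rangle$ introduced above, let

$$|z_i\rangle := \sum_{j=1}^{4} C_{ij}\, e^{-i\theta_j}\, |w_j\rangle ,\qquad C = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} ,$$

where $e^{-i\theta_4}$ and $|w_4\rangle$ do not appear if $\lambda_4 = 0$.
Introducing the normalized vectors $|\psi_i\rangle := |z_i\rangle / \|z_i\|$, from $C^\dagger C = 1$ it turns out that

$$\rho = \sum_{i=1}^{4} \|z_i\|^2\ |\psi_i\rangle\langle\psi_i| ,\quad \hbox{and}\quad |\langle \psi_i|\tilde\psi_i\rangle| = \frac{|f(\theta)|}{4 \|z_i\|^2} = 0$$

for all $i = 1, 2, 3, 4$. Then, the vectors $|\psi_i\rangle$ and thus $\rho$ are separable.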


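Case 2: $\lambda_1 \ge \lambda_2 + \lambda_3 + \lambda_4$.
Set $|y_1\rangle = |w_1\rangle$, $|y_j\rangle = i\, |w_j\rangle$, if $j \ge 2$. Then $\rho = \sum_{j=1}^{n} |y_j\rangle\langle y_j|$; further, consider the diagonal matrix $Y = \mathrm{diag}(\lambda_1, -\lambda_2, -\lambda_3, -\lambda_4)$ with entries $Y_{ij} := \langle y_i|\tilde y_j\rangle$.

Because of Example 5.5.4, any other decomposition $\rho = \sum_{j=1}^{n} |z_j\rangle\langle z_j|$ is such that $|z_j\rangle = \sum_{i=1}^{n} V^*_{ji}\, |y_i\rangle$, with $V$ a unitary matrix on $\mathbb{C}^n$. Therefore, for orthogonal $V$, the quantity

$$\sum_{j=1}^{n} \langle z_j|\tilde z_j\rangle = \sum_{j,i,k=1}^{n} V_{ji} V_{jk}\, Y_{ik} = \mathrm{Tr}(V Y V^T) = \mathrm{Tr}(Y) = \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4 = C(\rho)$$

is independent of $V$. By using this invariance property, one can find a decomposition $\rho = \sum_{j=1}^{n} |z_j\rangle\langle z_j|$ such that

$$\langle z_j|\tilde z_j\rangle = \|z_j\|^2\, C\Big( \frac{z_j}{\|z_j\|} \Big) = \|z_j\|^2\, C(\rho)$$

for all $1 \le j \le n$. Thus, its average concurrence (6.16) equals $C(\rho)$,

$$\langle C \rangle = \sum_{j=1}^{n} \|z_j\|^2\, C\Big( \frac{z_j}{\|z_j\|} \Big) = \sum_{j=1}^{n} \langle z_j|\tilde z_j\rangle = C(\rho) .$$

The $|z_j\rangle$ are constructed as follows: since $\sum_j \|y_j\|^2 = 1$ and $\sum_j Y_{jj} = C(\rho)$, unless all the ratios $Y_{jj} / \|y_j\|^2$ are already equal to $C(\rho)$, there must be one decomposer, $y_1$ say, with $Y_{11} / \|y_1\|^2 > C(\rho)$, and another one, $y_2$, with $Y_{22} / \|y_2\|^2 < C(\rho)$. Choosing $V$ that exchanges $y_1$ with $y_2$ and leaves the other decomposers fixed, we obtain a decomposition with the same average concurrence and $Y_{11}$, $Y_{22}$ exchanged. By continuity there must exist an orthogonal matrix $V$, rotating $y_1$ and $y_2$ only, such that $\langle z_1|\tilde z_1\rangle = \|z_1\|^2\, C(\rho)$, with $|z_j\rangle = \sum_i V_{ji}\, |y_i\rangle$. Iteration of this argument for the remaining decomposers yields the result.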

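The proof of the theorem is then concluded by showing that no decomposition can achieve a smaller average concurrence than $C(\rho)$. Indeed, using again Example 5.5.4, a generic decomposition $\rho = \sum_{q=1}^{Q} |z_q\rangle\langle z_q|$ has average concurrence

$$\langle C \rangle_{\rho = \sum_{q=1}^{Q} |z_q\rangle\langle z_q|} = \sum_{q=1}^{Q} |\langle z_q|\tilde z_q\rangle| = \sum_{q=1}^{Q} \Big| \sum_{i=1}^{n} (Z_{qi})^2\, Y_{ii} \Big| ,$$

where $Z : \mathbb{C}^n \mapsto \mathbb{C}^Q$ is an isometry: $\sum_{q=1}^{Q} |Z_{qi}|^2 = 1$ for all $1 \le i \le n$. Since one can always adjust the phases of the $|z_q\rangle$ in such a way that $(Z_{q1})^2 = |Z_{q1}|^2 \ge 0$, for all $1 \le q \le Q$, then

$$\sum_{q=1}^{Q} \Big| \sum_{i=1}^{n} (Z_{qi})^2\, Y_{ii} \Big| \ge \Big| \sum_{q=1}^{Q} \sum_{i=1}^{n} (Z_{qi})^2\, Y_{ii} \Big| = \Big| \lambda_1 - \sum_{q=1}^{Q} \sum_{i=2}^{n} (Z_{qi})^2\, \lambda_i \Big| \ge \lambda_1 - \sum_{i=2}^{n} \lambda_i = C(\rho) .$$

Indeed, by assumption,

$$\Big| \sum_{q=1}^{Q} \sum_{i=2}^{n} (Z_{qi})^2\, \lambda_i \Big| \le \sum_{i=2}^{n} \sum_{q=1}^{Q} |(Z_{qi})^2|\, \lambda_i = \lambda_2 + \lambda_3 + \lambda_4 \le \lambda_1 .$$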


6.2.4 Two-Mode Gaussian States 


Let $S$ be a bipartite continuous variable system consisting of two subsystems $A$ and $B$ described by annihilation and creation operators $a_i$, $a_i^\dagger$, $i = 1, 2, \ldots, p$, and $b_i$, $b_i^\dagger$, $i = 1, 2, \ldots, q$, $p + q = f$, satisfying the CCR (5.94), and arranged, as in (5.97), into a vector


Ẹ =(a,b,a',b*), a = (a, ..., ap), b := (bi, ..., bg) - 


We know that a state of S described by a density matrix p is specified by the charac- 
teristic function (5.123) which now reads 


FY (2) =Tr (p7 *) =Tr (pei 2 e748) (6.19) 


where A := (a, at), B := (b, b’), Za,b := (a,b; =z p) with Za := (Zal, Za2; +++ 
Zap) and Zp := (Zb1, Zb2, «++» Zbq)- Let Ta denote the transposition with respect to the 
orthonormal basis of the occupation number states |kg) = [kat ka2 = Sips kai € N, 
of the subsystem A (see (5.95)); then, using the number state basis {|kakp)}ka,kp> 
one calculates 


Tr (pt e2"¥) = JO (kako | p™ ljajo) dado |e2@4 @ 78 [hake 
ka, k 
jade 

= X jako | plkaiy)( ale? kajole” lko) 
ka.kp 
Jatdb 
= X (kako | pliady)(kale@ lja) (ip le” Iko) 


ka.kp 
Jatdbh 
$$= \mathrm{Tr}\big( \rho\ e^{Z_a \cdot A'} \otimes e^{Z_b \cdot B} \big) ,$$


where A’ := (at, a). The last equality easily follows from 


a p ; 
(kale |j,) = [] kai |e | jai) 


i=1 
and 
ea RAP a. N = * OT ` at mal 
(ler aCe? hy Sailer jk). 


Partial transposition thus amounts to exchanging annihilation and creation operators of a chosen subsystem within the Weyl operator appearing in the characteristic function of a bipartite state.

In terms of position and momentum operators this means keeping fixed $q_a = (a + a^\dagger)/\sqrt{2}$, $q_b = (b + b^\dagger)/\sqrt{2}$ and $p_b = (b - b^\dagger)/(i\sqrt{2})$ while changing $p_a = (a - a^\dagger)/(i\sqrt{2})$ into $-p_a$. This observation identifies partial transposition as a local mirror reflection [329]. Also in the continuous variable case, separable bipartite states must remain positive, hence well-defined states, under partial transposition. Then, if the correlation matrix associated with $\rho^{T_a}$ fails to satisfy (5.120) the state $\rho$ is surely entangled. In view of the fact that positivity under partial transposition fails to be equivalent to separability already for two 3-level systems, one may suspect this to be the case for all continuous variable systems as well. Surprisingly it turns out that partial transposition is an exhaustive entanglement witness also for two-mode Gaussian states [329] (see also [124, 125, 238]).
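The mirror reflection picture lends itself to a short numerical illustration. The following is a minimal sketch, not the book's formulation: it assumes NumPy, the ordering $(q_1, p_1, q_2, p_2)$, $\hbar = 1$ with vacuum variances equal to $1/2$, and the standard uncertainty condition $V + \frac{i}{2}\Omega \ge 0$ as the criterion for a correlation matrix to belong to a state (the precise form of (5.120) is not reproduced here and may differ by convention). The two-mode squeezed vacuum and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Symplectic form for the ordering (q1, p1, q2, p2)  [assumed convention]
OMEGA = np.kron(np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]]))

def is_physical(V, tol=1e-12):
    """Uncertainty relation V + (i/2) Omega >= 0 (hbar = 1, vacuum variance 1/2)."""
    return np.linalg.eigvalsh(V + 0.5j * OMEGA).min() >= -tol

def mirror_reflect(V):
    """Partial transposition on the first mode: p1 -> -p1."""
    L = np.diag([1.0, -1.0, 1.0, 1.0])
    return L @ V @ L

def two_mode_squeezed(r):
    """Correlation matrix of a two-mode squeezed vacuum with squeezing r."""
    c, s = np.cosh(2 * r) / 2, np.sinh(2 * r) / 2
    return np.array([[c, 0, s, 0],
                     [0, c, 0, -s],
                     [s, 0, c, 0],
                     [0, -s, 0, c]])

for r in (0.0, 0.5):
    V = two_mode_squeezed(r)
    print(r, is_physical(V), is_physical(mirror_reflect(V)))
```

For $r = 0$ (the vacuum) both matrices pass the test, while for $r > 0$ the reflected matrix violates the uncertainty relation, signalling that the squeezed state is NPT and hence entangled.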

We shall use the notation of Examples 5.5.3 and start by noting that one can 
always consider Gaussian states p with characteristic function 


G(R) = e7 IR- (È VE|R) 


as in (5.141). Indeed, as local operations that do not alter the entanglement properties, the displacement operators $D(u) = D(u_1) \otimes D(u_2)$ can be used to set the mean values $\mathrm{Tr}(\rho\, (q, p)) = 0$. Partial transposition on the first mode amounts to replacing $p_1$ with $-p_1$ in $V$, thus $I_3$ with $-I_3$ in (5.148) (see (5.141)-(5.145)). Thus $\rho$ and $\rho^{T_a}$ are well-defined states if and only if both the following inequalities hold


+h>h+h+2h (6.20) 


Ale ALE 


+4>h+h-2h. (6.21) 
Also, the operations leading from a generic $V$ to the standard form $V_0$ in (5.147) are local ones, acting independently on the two subsystems $A$ and $B$; this means that if a two-mode Gaussian state $\rho$ with correlation matrix $V$ is separable, the same is true of the two-mode Gaussian state $\rho_0$ with correlation matrix $V_0$. In [329] it is shown that


Lemma 6.2.1 All two-mode Gaussian states with $I_3 \geq 0$ are separable.


Notice that, when $I_3 \geq 0$, (6.20) implies (6.21) whence $\rho^{T_A} \geq 0$, in agreement with its being separable. Suppose instead that a two-mode Gaussian state $\rho$ with $I_3 < 0$ is PPT; then its mirror reflected $\rho^{T_A}$ has $I_3 > 0$ and is thus separable by the Lemma, whence by a second mirror reflection also $\rho$ is separable.


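Proof of Lemma 6.2.1 The strategy of the proof is to show that if a two-mode Gaussian state $\rho$ has a correlation matrix $V$ with $I_3 \geq 0$, then $V \geq \mathbb{1}_4/2$ whence, because of Example 5.5.3.3, $\rho$ is separable. Because of the possibility of reducing $V$ to the standard form

\[
V_0 = \begin{pmatrix} \alpha & 0 & \gamma_1 & 0 \\ 0 & \alpha & 0 & \gamma_2 \\ \gamma_1 & 0 & \beta & 0 \\ 0 & \gamma_2 & 0 & \beta \end{pmatrix}
\]

by local operations, one can equivalently show that $I_3 = \gamma_1\gamma_2 > 0$ implies $V_0 \geq \mathbb{1}_4/2$. Analogously, since matrices of the form $O(x) = \mathrm{diag}(x, x^{-1})$, $0 \neq x \in \mathbb{R}$, implement local scalings of positions and momenta which preserve the symplectic matrix $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, one can focus upon

\[
V_0' = \begin{pmatrix} O(y)O(x) & 0 \\ 0 & O(y)O(x^{-1}) \end{pmatrix} V_0 \begin{pmatrix} O(x)O(y) & 0 \\ 0 & O(x^{-1})O(y) \end{pmatrix}
= \begin{pmatrix} \alpha x^2 y^2 & 0 & \gamma_1 y^2 & 0 \\ 0 & \alpha x^{-2} y^{-2} & 0 & \gamma_2 y^{-2} \\ \gamma_1 y^2 & 0 & \beta x^{-2} y^2 & 0 \\ 0 & \gamma_2 y^{-2} & 0 & \beta x^2 y^{-2} \end{pmatrix}.
\]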


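Consider the $2\times 2$ matrices

\[
X := \begin{pmatrix} \alpha x^2 & \gamma_1 \\ \gamma_1 & \beta x^{-2} \end{pmatrix}, \qquad
Y := \begin{pmatrix} \alpha x^{-2} & \gamma_2 \\ \gamma_2 & \beta x^{2} \end{pmatrix},
\]

and notice that, according to (5.142)-(5.145), their entries are correlations involving position ($q_{1,2}$), respectively momentum operators ($p_{1,2}$). Their eigenvalues are

\[
x_\pm = \frac{1}{2}\Big(\alpha x^2 + \beta x^{-2} \pm \sqrt{4\gamma_1^2 + (\alpha x^2 - \beta x^{-2})^2}\Big), \qquad
y_\pm = \frac{1}{2}\Big(\alpha x^{-2} + \beta x^{2} \pm \sqrt{4\gamma_2^2 + (\alpha x^{-2} - \beta x^{2})^2}\Big),
\]

with eigenvectors $|x_\pm\rangle = (x_\pm^1, x_\pm^2)^T$, respectively $|y_\pm\rangle = (y_\pm^1, y_\pm^2)^T$, such that

\[
\frac{x_\pm^2}{x_\pm^1} = \frac{\beta x^{-2} - \alpha x^2 \pm \sqrt{4\gamma_1^2 + (\alpha x^2 - \beta x^{-2})^2}}{2\gamma_1}, \qquad
\frac{y_\pm^2}{y_\pm^1} = \frac{\beta x^{2} - \alpha x^{-2} \pm \sqrt{4\gamma_2^2 + (\alpha x^{-2} - \beta x^{2})^2}}{2\gamma_2}.
\]

Since $|x_\pm\rangle$, respectively $|y_\pm\rangle$, are the rows of the orthogonal rotation matrices $O_X$ and $O_Y$ which diagonalize $X$, respectively $Y$, one makes $O_X = O_Y$ by choosing the scaling parameter $x$ such that

\[
\frac{\gamma_1}{\alpha x^2 - \beta x^{-2}} = \frac{\gamma_2}{\alpha x^{-2} - \beta x^{2}},
\qquad\text{namely}\qquad
x = \Big(\frac{\alpha\gamma_1 + \beta\gamma_2}{\alpha\gamma_2 + \beta\gamma_1}\Big)^{1/4}.
\]

Since the diagonalization of the two sub-matrices $X$ and $Y$ of $V_0'$ is obtained by means of the same orthogonal rotation, the overall transformation is symplectic (that is, it preserves $\Omega = \begin{pmatrix} J & 0 \\ 0 & J \end{pmatrix}$). Therefore, one can study the diagonal matrix

\[
V_0'' = \begin{pmatrix} y^2 x_+ & 0 & 0 & 0 \\ 0 & y^{-2} y_+ & 0 & 0 \\ 0 & 0 & y^2 x_- & 0 \\ 0 & 0 & 0 & y^{-2} y_- \end{pmatrix},
\]

which must satisfy $V_0'' + \frac{i}{2}\Omega \geq 0$ whence $x_+ y_+ \geq 1/4$ and $x_- y_- \geq 1/4$. By choosing the remaining scaling parameter $y$ such that $y^2 x_- = y^{-2} y_-$, one gets that all four eigenvalues are $\geq 1/2$ (indeed, $y^2 x_- = y^{-2} y_- = \sqrt{x_- y_-} \geq 1/2$, while $x_+ \geq x_-$ and $y_+ \geq y_-$) and thus that $V_0'' \geq \mathbb{1}_4/2$. This means that the two-mode Gaussian state $\rho_0''$ corresponding to such a correlation matrix has a P-representation (5.117) with a positive phase-space function $R_0''(r)$ and is thus separable according to Example 5.5.3. Observe that this fact does not allow one to directly infer that also the state $\rho_0'$ with correlation matrix $V_0'$ is separable; indeed, the diagonalization of $V_0'$ has been obtained by non-local rotations involving both sub-systems. However, using (5.151) in Remark 5.5.3, the positive phase-space distribution $R_0''(r)$ is obtained from the function $R_0'(r)$ relative to the P-representation of $\rho_0'$ by means of a symplectic matrix $S$ composed of the same rotation $O$ in the $q_{1,2}$ and $p_{1,2}$ planes, so that $\|r\| = \|S^T r\|$. It thus follows that $R_0'(r) = R_0''(S^{-1} r) \geq 0$, whence $\rho_0'$ and thus $\rho$ are separable.

If $I_3 = 0$, let $\gamma_1 \geq \gamma_2 = 0$ and choose $x^2 = \sqrt{\alpha/\beta}$, $y^2 = 2\sqrt{\alpha\beta}$ so that

\[
V_0' = \begin{pmatrix} 2\alpha^2 & 0 & 2\gamma_1\sqrt{\alpha\beta} & 0 \\ 0 & 1/2 & 0 & 0 \\ 2\gamma_1\sqrt{\alpha\beta} & 0 & 2\beta^2 & 0 \\ 0 & 0 & 0 & 1/2 \end{pmatrix}.
\]

Similarly as before, one checks that $V_0' + \frac{i}{2}\Omega \geq 0 \Longrightarrow V_0' \geq \mathbb{1}_4/2$.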


6.2.5 Positive Maps and Semigroups 


We have seen in Sect. 6.2 that positive but not completely positive maps cannot be directly used as mathematical descriptions of fully consistent state-transformations; however, they play a major role as entanglement witnesses (see Proposition 6.2.1). Unfortunately, since there are no general rules that allow one to identify positive maps, only particular instances of them can be provided [85, 104-106, 136, 213]. The one which follows assumes the existence of just one negative eigenvalue in decompositions as in (5.43) that is smaller in absolute value than all the other ones [49].


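Proposition 6.2.2 Let $\{L_k\}_{k=1}^{d^2}$ be a Hilbert-Schmidt ONB in $M_d(\mathbb{C})$ and $\Lambda: M_d(\mathbb{C}) \to M_d(\mathbb{C})$ a linear map with a decomposition

\[
\Lambda[X] = \sum_{k=1}^{d^2} \ell_k\, L_k^\dagger X L_k, \qquad X \in M_d(\mathbb{C}),
\]

where $0 < \ell_i \leq \ell_{i+1}$ for $i \geq 2$ while $\ell_1 = -|\ell_1| < 0$, with $|\ell_1| \leq \ell_2$. If

\[
|\ell_1| \leq \ell_2\, \frac{1 - \|L_1\|^2}{\|L_1\|^2},
\]

then $\Lambda$ is positive.

Proof The matrices $L_k$ form a Hilbert-Schmidt ONB, thus, using (5.32), it turns out that, for all normalized $\psi, \phi \in \mathbb{C}^d$,

\[
\sum_{k=1}^{d^2} \big|\langle\psi|L_k^\dagger|\phi\rangle\big|^2 = \sum_{k=1}^{d^2} \langle\psi|L_k^\dagger|\phi\rangle\langle\phi|L_k|\psi\rangle = \mathrm{Tr}\big(|\phi\rangle\langle\phi|\big)\,\langle\psi|\psi\rangle = 1.
\]

Then, since $\|L_1\|^2 = \|L_1^\dagger\|^2 \leq \mathrm{Tr}(L_1^\dagger L_1) = 1$ (see Remark 5.2.5), it follows that

\[
\begin{aligned}
\langle\psi|\Lambda[|\phi\rangle\langle\phi|]|\psi\rangle
&= \sum_{k=1}^{d^2} \ell_k \big|\langle\psi|L_k^\dagger|\phi\rangle\big|^2
\geq \ell_2 \sum_{k=2}^{d^2} \big|\langle\psi|L_k^\dagger|\phi\rangle\big|^2 - |\ell_1|\,\big|\langle\psi|L_1^\dagger|\phi\rangle\big|^2 \\
&= \ell_2 - (\ell_2 + |\ell_1|)\,\big|\langle\psi|L_1^\dagger|\phi\rangle\big|^2
\geq \ell_2\,(1 - \|L_1\|^2) - |\ell_1|\,\|L_1\|^2 .
\end{aligned}
\]

If $|\ell_1| \leq \ell_2\,(1 - \|L_1\|^2)/\|L_1\|^2$, then it follows that $\langle\psi|\Lambda[|\phi\rangle\langle\phi|]|\psi\rangle \geq 0$ for all normalized $\psi, \phi \in \mathbb{C}^d$, whence $\Lambda$ is positive.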


In Remark 5.6.4 it has been stressed that, apart from the fact that the Kossakowski matrix $C = [C_{ij}]$ cannot be positive, there are no general prescriptions on $C$ such that the corresponding semigroup surely consists of positive, but not CP maps; as for positive maps, one can however seek sufficient conditions.

In the following, we shall consider a system consisting of two $d$-level systems $S_d$ and construct [49] a semigroup of positive, but not CP maps $\Gamma_t = \gamma_t^{(1)} \otimes \gamma_t^{(2)}$ on $M_d(\mathbb{C}) \otimes M_d(\mathbb{C})$, where $\gamma_t^{(1)} = \exp(t L_1)$ is a semigroup of CP maps, while $\gamma_t^{(2)}$ is a semigroup of positive, but not CP maps. The construction will also provide non-decomposable positive maps able to witness bound entangled states within a particular class of bipartite states with $d = 4$ [50]. We shall consider generators as in Proposition 5.6.1 without asking for the positivity of the Kossakowski matrix.


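Proposition 6.2.3 Suppose $\gamma_t^{(1,2)}: M_d(\mathbb{C}) \to M_d(\mathbb{C})$ to be semigroups with generators

\[
L_i[X] = \sum_{\ell=1}^{d^2-1} c_\ell^{(i)} \Big( G_\ell^{(i)} X G_\ell^{(i)} - \frac{1}{2}\big\{ (G_\ell^{(i)})^2, X \big\} \Big),
\]

for $i = 1, 2$, where $G_\ell^{(i)} = (G_\ell^{(i)})^\dagger \in M_d(\mathbb{C})$, together with $G_0 = \mathbb{1}/\sqrt{d}$, form two Hilbert-Schmidt ONBs in $M_d(\mathbb{C})$.

Assume $c_\ell^{(1)} > 0$, $\ell = 1, 2, \ldots, d^2-1$, and $c_k^{(2)} = -|c_k^{(2)}| < 0$ for one index $k$, while $c_\ell^{(2)} > 0$ for $\ell \neq k$. Then, the semigroup of maps $\Gamma_t = \gamma_t^{(1)} \otimes \gamma_t^{(2)}$ on $M_d(\mathbb{C}) \otimes M_d(\mathbb{C})$ preserves positivity if $c_\ell^{(1)} \geq |c_k^{(2)}|$, $\ell = 1, 2, \ldots, d^2-1$, and $c_\ell^{(2)} \geq |c_k^{(2)}|$, $\ell = 1, 2, \ldots, d^2-1$, $\ell \neq k$.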


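Proof According to [211, 212] (see also [80]), in order to show that the semigroup $\{\Gamma_t\}_{t\geq 0}$, with generator $L = L^{(1)} \otimes \mathrm{id}_d + \mathrm{id}_d \otimes L^{(2)}$, consists of positive maps, it is sufficient to prove that

\[
I(\psi, \phi) := \langle\psi|\, L[|\phi\rangle\langle\phi|]\, |\psi\rangle \geq 0
\]

for all orthogonal $\psi, \phi \in \mathbb{C}^d \otimes \mathbb{C}^d$. Since $\langle\psi|\phi\rangle = 0$, it follows that

\[
I(\psi, \phi) = \sum_{\ell=1}^{d^2-1} c_\ell^{(1)} \big|\langle\psi| G_\ell^{(1)} \otimes \mathbb{1} |\phi\rangle\big|^2 + \sum_{\ell=1}^{d^2-1} c_\ell^{(2)} \big|\langle\psi| \mathbb{1} \otimes G_\ell^{(2)} |\phi\rangle\big|^2 .
\]

6 If the two semigroups $\gamma_t^{(i)}$ were the same, then, according to Proposition 5.6.1 and the successive remark, $\Gamma_t$ positive would mean $\gamma_t$ CP.

It proves convenient to define the following $d \times d$ matrices $\Psi = [\psi_{ij}]$ and $\Phi = [\phi_{ij}]$, where $\psi_{ij}$ and $\phi_{ij}$ are the components of the vectors $\psi$ and $\phi$ with respect to a fixed ONB $\{|i, j\rangle\}_{i,j=1}^d$ in $\mathbb{C}^d \otimes \mathbb{C}^d$. Then, one rewrites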


al d-1 

2 2 

IA = Y eP TG, ov] + > ef? miGPwtey" 
f=1 f=1 


a-l 

2 

= = See Pp mGMeuh| + + D e [ireua] 
k#l=1 


d?—1 

2 2 

EIS [rc ow] 2 [re wto) 
t= 


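As $\langle\psi|\phi\rangle = 0$, the matrices $\Phi\Psi^\dagger$ and $\Psi^\dagger\Phi$ are traceless; using the ONBs consisting of the matrices $G_\ell^{(i)} = (G_\ell^{(i)})^\dagger$, $\ell = 0, 1, \ldots, d^2-1$, $i = 1, 2$, one thus gets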


d?—1 d?—1 
Y GP oyt) = t((X Tr(G ow") G{) ou") 
f=1 t51 
=Tr(OW') = Trwte) = Tr(wt@)?)? 
d-1 
=) mG wie). 
{=t} 
This yields 


d?—1 d?—1 

2 2 

TGQ)" | < Tra, (ow")| +> HGP wio) 
t=1 k#l=1 


whence one concludes 


1h, 6) > Ter- ep meP evf 


= 1 
+ Ve? - Pp [TGP wey") > 0. 
Example 6.2.1 ([49]) Let $d = 2$ and $\sigma_\alpha$, $\alpha = 0, 1, 2, 3$, be the Pauli matrices plus the $2\times 2$ identity matrix $\sigma_0$. Let $S_\alpha: M_2(\mathbb{C}) \to M_2(\mathbb{C})$ be the completely positive map $X \mapsto S_\alpha[X] = \sigma_\alpha X \sigma_\alpha$, and set

\[
X \mapsto \frac{1}{2}\sum_{\alpha=0}^{3} S_\alpha[X], \qquad X \mapsto \frac{1}{2}\sum_{\alpha=0}^{3} \varepsilon_\alpha S_\alpha[X],
\]

where $\varepsilon_\alpha = 1$ when $\alpha \neq 2$, whereas $\varepsilon_2 = -1$. The first map amounts to the trace map $\mathrm{Tr}_2$, $X \mapsto \mathrm{Tr}(X)\,\mathbb{1}_2$ (see (5.32)), while the second one corresponds to the transposition $T_2$ with respect to the basis of eigenvectors of $\sigma_3$: indeed, it changes $\sigma_2$ into $-\sigma_2$ and leaves all other Pauli matrices unchanged. According to Proposition 6.2.2, it is positive but not CP, for $[\ell_\alpha] = \mathrm{diag}(1, 1, -1, 1)$ and $\|\sigma_\alpha/\sqrt{2}\|^2 = 1/2$.
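The two Pauli-sum identities of this example can be checked numerically. The following is a minimal sketch in plain Python (nested lists as $2\times 2$ matrices); the helper names `matmul`, `pauli_sum`, `trace_map` and `transpose_map` are illustrative choices, not notation from the text.

```python
# Pauli matrices sigma_0..sigma_3 as nested lists of complex numbers.
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
EPS = [1, 1, -1, 1]  # epsilon_alpha: equal to -1 only for alpha = 2


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def pauli_sum(x, weights):
    """Return (1/2) * sum_alpha weights[alpha] * sigma_alpha x sigma_alpha."""
    out = [[0j, 0j], [0j, 0j]]
    for w, s in zip(weights, SIGMA):
        term = matmul(matmul(s, x), s)
        out = [[out[i][j] + 0.5 * w * term[i][j] for j in range(2)]
               for i in range(2)]
    return out


def trace_map(x):
    """Uniform weights: expected to give Tr(x) times the identity."""
    return pauli_sum(x, [1, 1, 1, 1])


def transpose_map(x):
    """Weights epsilon_alpha: expected to give the transpose of x."""
    return pauli_sum(x, EPS)


x = [[1 + 2j, 3 - 1j], [0.5j, -4]]
print(trace_map(x))
print(transpose_map(x))
```

The test matrix `x` is arbitrary; by linearity, agreement on the four Pauli matrices already fixes both maps.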

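Consider generators $L_{1,2}$ as in Proposition 6.2.3 with $d = 2$, $F_i = \sigma_i/\sqrt{2}$ and choose as Kossakowski matrices

\[
C^{(1)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
C^{(2)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

Then, the corresponding master equations read

\[
\partial_t \rho_t = L_1[\rho_t] = \frac{1}{2}\sum_{i=1}^{3} \big(\sigma_i \rho_t \sigma_i - \rho_t\big), \qquad
\partial_t \rho_t = L_2[\rho_t] = \frac{1}{2}\sum_{i=1}^{3} \varepsilon_i \big(\sigma_i \rho_t \sigma_i - \rho_t\big).
\]

The second one has been considered in Example 5.6.6 and generates a positive semigroup such that, for $\rho = \frac{1}{2}(\mathbb{1} + \sum_i \rho_i\sigma_i)$,

\[
\gamma_t^{(2)}[\rho] = \frac{1}{2}\big(\mathbb{1} + \rho_1\sigma_1 + e^{-2t}\rho_2\sigma_2 + \rho_3\sigma_3\big) = \rho - \frac{1 - e^{-2t}}{2}\,\rho_2\sigma_2 .
\]

Since $L_1[\sigma_j] = -2\sigma_j$ while $L_1[\sigma_0] = 0$, the solutions of the first master equation are the following CPU maps,

\[
\gamma_t^{(1)}[\rho] = \frac{\mathbb{1} + e^{-2t}\,\rho\cdot\sigma}{2} = \frac{1 - e^{-2t}}{2}\,\mathbb{1} + e^{-2t}\rho .
\]

Let $\mathrm{id}_2$, $\mathrm{Tr}_2$ and $T_2$ denote identity, trace and transposition operations on $M_2(\mathbb{C})$; since $1 = \mathrm{Tr}(\rho)$ and $\rho - T_2[\rho] = \rho_2\sigma_2$, one rewrites

\[
\gamma_t^{(1)} = e^{-2t}\,\mathrm{id}_2 + \frac{1 - e^{-2t}}{2}\,\mathrm{Tr}_2, \qquad
\gamma_t^{(2)} = \frac{1 + e^{-2t}}{2}\,\mathrm{id}_2 + \frac{1 - e^{-2t}}{2}\,T_2 .
\]

As $T_4 = T_2 \otimes T_2$ and $\mathrm{Tr}_2 \circ T_2 = \mathrm{Tr}_2$, the tensor product maps $\Gamma_t = \gamma_t^{(1)} \otimes \gamma_t^{(2)}$ can be recast in the form

\[
\Gamma_t = \underbrace{\frac{e^{-2t} + e^{-4t}}{2}\,\mathrm{id}_4 + \frac{1 - e^{-4t}}{4}\,\mathrm{Tr}_2 \otimes \mathrm{id}_2}_{\Gamma_t^1}
+ \underbrace{\frac{1 - e^{-2t}}{2}\Big(e^{-2t}\, T_2 \otimes \mathrm{id}_2 + \frac{1 - e^{-2t}}{2}\,\mathrm{Tr}_2 \otimes \mathrm{id}_2\Big)}_{\Gamma_t^2} \circ\, T_4 . \qquad (6.22)
\]

The semigroup $\{\Gamma_t\}_{t\geq 0}$ consists of positive maps because the chosen Kossakowski matrices satisfy the sufficient condition of Proposition 6.2.3; moreover, the maps $\Gamma_t$ are of the form $\Gamma_t = \Gamma_t^1 + \Gamma_t^2 \circ T_4$. It turns out that $\Gamma_t^1$ is completely positive for all $t \geq 0$ for it is the sum of tensor products of completely positive maps. If $\Gamma_t^2$ were also CP, each map $\Gamma_t$ would then be decomposable; in order to check whether this is true or not, consider the Choi matrix $\mathrm{id}_4 \otimes \Lambda_t[P_+^{(4)}]$, where

\[
\Lambda_t := e^{-2t}\, T_2 \otimes \mathrm{id}_2 + \frac{1 - e^{-2t}}{2}\,\mathrm{Tr}_2 \otimes \mathrm{id}_2 .
\]

Fixing a basis $\{|0\rangle, |1\rangle\}$ in $\mathbb{C}^2$ and writing $|\Psi_+^{(4)}\rangle = \frac{1}{2}\sum_{a,b=0}^{1} |ab\rangle \otimes |ab\rangle$, one explicitly computes

\[
\mathrm{id}_4 \otimes \Lambda_t[P_+^{(4)}] = \frac{1}{4}
\begin{pmatrix}
(1 + e^{-2t}) P_+^{(2)} & 0 & 0 & 0 \\
0 & (1 - e^{-2t}) P_+^{(2)} & 2e^{-2t} P_+^{(2)} & 0 \\
0 & 2e^{-2t} P_+^{(2)} & (1 - e^{-2t}) P_+^{(2)} & 0 \\
0 & 0 & 0 & (1 + e^{-2t}) P_+^{(2)}
\end{pmatrix},
\]

with $P_+^{(2)} = |\Psi_{00}\rangle\langle\Psi_{00}|$. This $16 \times 16$ matrix has eigenvalue $0$ with eigenvectors

\[
(|\Psi_j\rangle, 0, 0, 0), \quad (0, |\Psi_j\rangle, 0, 0), \quad (0, 0, |\Psi_j\rangle, 0), \quad (0, 0, 0, |\Psi_j\rangle), \qquad j = 2, 3, 4,
\]

where $|\Psi_j\rangle$ are the Bell states orthogonal to $|\Psi_{00}\rangle$, while

\[
(|\Psi_{00}\rangle, 0, 0, 0), \quad (0, 0, 0, |\Psi_{00}\rangle), \quad (0, |\Psi_{00}\rangle, |\Psi_{00}\rangle, 0)
\]

are eigenvectors relative to the positive eigenvalue $(1 + e^{-2t})/4$. More interesting is the last eigenvalue $(1 - 3e^{-2t})/4$ with eigenvector $(0, |\Psi_{00}\rangle, -|\Psi_{00}\rangle, 0)$: it is positive only if $t \geq t^* = (\log 3)/2$. It follows that $\Gamma_t$ is surely decomposable for $t = 0$ ($\Gamma_0 = \mathrm{id}_{16}$) and for $t \geq t^*$.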
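The threshold $t^* = (\log 3)/2$ can be cross-checked with a few lines of Python. The sketch below works only with the $4\times 4$ matrix of coefficients multiplying the projector onto $|\Psi_{00}\rangle$ in the Choi matrix; the overall factor $1/4$ is an assumption, fixed by requiring unit trace, and the function names are illustrative.

```python
import math


def coefficient_matrix(t):
    """4x4 coefficients of the block Choi matrix (assumed normalised to trace 1)."""
    lam = math.exp(-2.0 * t)
    return [
        [(1 + lam) / 4, 0.0, 0.0, 0.0],
        [0.0, (1 - lam) / 4, lam / 2, 0.0],
        [0.0, lam / 2, (1 - lam) / 4, 0.0],
        [0.0, 0.0, 0.0, (1 + lam) / 4],
    ]


def apply(m, v):
    """Matrix-vector product for a 4x4 matrix."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def critical_eigenvalue(t):
    """Eigenvalue belonging to the antisymmetric vector (0, 1, -1, 0)."""
    return (1 - 3 * math.exp(-2.0 * t)) / 4


T_STAR = math.log(3) / 2  # time at which the critical eigenvalue changes sign

for t in (0.2, T_STAR, 1.0):
    print(t, critical_eigenvalue(t))
```

For $0 < t < t^*$ the printed eigenvalue is negative, so this particular decomposition does not establish decomposability there.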
In order to ascertain whether the positive maps $\Gamma_t$ constructed in the previous example fail to be decomposable for $0 < t < t^*$, we need some further insight. Indeed, the decomposition (6.22) need not be unique and there might be other decompositions revealing that $\Gamma_t$ is decomposable for all $t \geq 0$. In order to proceed, we use the following result.


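Lemma 6.2.2 ([49]) Let $\Lambda: M_{d_1}(\mathbb{C}) \to M_{d_2}(\mathbb{C})$ be a positive map and $\rho$ a PPT state of a bipartite system $S_1 + S_2$. If $\mathrm{Tr}\big(\mathrm{id}_{d_1} \otimes \Lambda[P_+^{(d_1)}]\, \rho\big) < 0$, the state $\rho$ is bound-entangled and $\Lambda$ not decomposable.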


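Proof If $\Lambda$ is decomposable, so is its dual $\Lambda^T: M_{d_2}(\mathbb{C}) \to M_{d_1}(\mathbb{C})$; indeed,

\[
\Lambda = \Lambda_1 + \Lambda_2 \circ T_{d_1} \Longrightarrow \Lambda^T = \Lambda_1^T + T_{d_1} \circ \Lambda_2^T = \Lambda_1^T + \tilde\Lambda_2 \circ T_{d_2},
\]

where $\Lambda_1^T$ is CP for it is the dual of a CP map, while $\tilde\Lambda_2 := T_{d_1} \circ \Lambda_2^T \circ T_{d_2}$ is also CP as the corresponding Choi matrix is positive. In fact, with matrix units $E_{ij} = |i\rangle\langle j|$,

\[
\mathrm{id}_{d_2} \otimes \tilde\Lambda_2[P_+^{(d_2)}] = \frac{1}{d_2}\sum_{i,j=1}^{d_2} E_{ij} \otimes \big(T_{d_1} \circ \Lambda_2^T\big)[E_{ji}]
= T_{d_2 d_1} \circ \frac{1}{d_2}\sum_{i,j=1}^{d_2} E_{ji} \otimes \Lambda_2^T[E_{ji}]
= T_{d_2 d_1} \circ \big(\mathrm{id}_{d_2} \otimes \Lambda_2^T\big)[P_+^{(d_2)}] \geq 0,
\]

where $T_{d_2 d_1} = T_{d_2} \otimes T_{d_1}$ is the transposition on $M_{d_2 d_1}(\mathbb{C}) = M_{d_2}(\mathbb{C}) \otimes M_{d_1}(\mathbb{C})$. It preserves the positivity of the Choi matrix $\mathrm{id}_{d_2} \otimes \Lambda_2^T[P_+^{(d_2)}]$ associated with the CP map $\Lambda_2^T$. Since $\rho$ is assumed to be PPT, if $\Lambda$ is decomposable or $\rho$ separable, then $\mathrm{id}_{d_1} \otimes \Lambda^T[\rho] \geq 0$.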


To make good use of the above result, it is convenient to introduce a particular class [43, 48] of 16-dimensional density matrices as states of a bipartite system consisting of two pairs of qubits; their structure is simple, yet flexible enough to provide an interesting setting in which to test the decomposability of a wide range of positive maps $\Lambda: M_4(\mathbb{C}) \to M_4(\mathbb{C})$.


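Example 6.2.2 Let $S = S_1 + S_2$ be a bipartite system where $S_{1,2}$ are each a two qubit system; consider the sub-class of $16 \times 16$ density matrices constructed by associating to the pairs of the set $L_{16} := \{(\alpha, \beta)\}_{\alpha,\beta=0}^{3}$ the vectors $|\Psi_{\alpha\beta}\rangle := (\mathbb{1}_4 \otimes \sigma_{\alpha\beta})|\Psi_+^{(4)}\rangle$, where $|\Psi_+^{(4)}\rangle$ is the Bell state in (5.174) and $\sigma_{\alpha\beta} := \sigma_\alpha \otimes \sigma_\beta$ are tensor products of Pauli matrices with $\sigma_0 = \mathbb{1}_2$. The vectors $|\Psi_{\alpha\beta}\rangle$ form an ONB in $\mathbb{C}^{16}$,

\[
\langle\Psi_{\alpha\beta}|\Psi_{\gamma\delta}\rangle = \langle\Psi_+^{(4)}|\mathbb{1}_4 \otimes \sigma_{\alpha\beta}\sigma_{\gamma\delta}|\Psi_+^{(4)}\rangle = \frac{1}{4}\mathrm{Tr}\big(\sigma_{\alpha\beta}\sigma_{\gamma\delta}\big) = \delta_{\alpha\gamma}\delta_{\beta\delta} .
\]

Given the corresponding orthogonal projections

\[
P_{\alpha\beta} := |\Psi_{\alpha\beta}\rangle\langle\Psi_{\alpha\beta}| = (\mathbb{1}_4 \otimes \sigma_{\alpha\beta})\, P_+^{(4)}\, (\mathbb{1}_4 \otimes \sigma_{\alpha\beta}), \qquad P_{\alpha\beta} P_{\gamma\delta} = \delta_{\alpha\gamma}\delta_{\beta\delta}\, P_{\alpha\beta},
\]

consider the states consisting of all equidistributed convex combinations

\[
\rho_I = \frac{1}{N_I}\sum_{(\alpha,\beta)\in I} P_{\alpha\beta},
\]

where $I$ is a subset of $L_{16}$ and $N_I$ its cardinality.

The behavior of such states under partial transposition can be deduced by means of the fact that the flip operator $V = d\, \mathrm{id}_d \otimes T_d[P_+^{(d)}]$ (5.35) is such that $V|\Psi_+^{(d)}\rangle = |\Psi_+^{(d)}\rangle$ and $V(A \otimes B)V = B \otimes A$ for all $A, B \in M_d(\mathbb{C})$, while

\[
A \otimes B|\Psi_+^{(d)}\rangle = \frac{1}{\sqrt{d}}\sum_{i,j=1}^{d} |j\rangle\langle j|A|i\rangle \otimes B|i\rangle = \frac{1}{\sqrt{d}}\sum_{j=1}^{d} |j\rangle \otimes B A^T|j\rangle = \mathbb{1} \otimes B A^T|\Psi_+^{(d)}\rangle .
\]

Then, setting $\tilde P_{\alpha\beta} := \mathrm{id}_4 \otimes T_4[P_{\alpha\beta}] = \frac{1}{4}(\mathbb{1}_4 \otimes \sigma_{\alpha\beta})\, V\, (\mathbb{1}_4 \otimes \sigma_{\alpha\beta})$, it turns out that

\[
\begin{aligned}
\tilde P_{\alpha\beta}|\Psi_{\gamma\delta}\rangle
&= \frac{1}{4}(\mathbb{1}_4 \otimes \sigma_{\alpha\beta})\, V\, (\mathbb{1}_4 \otimes \sigma_{\alpha\beta}\sigma_{\gamma\delta})\, V\,|\Psi_+^{(4)}\rangle
= \frac{1}{4}(\mathbb{1}_4 \otimes \sigma_{\alpha\beta})(\sigma_{\alpha\beta}\sigma_{\gamma\delta} \otimes \mathbb{1}_4)|\Psi_+^{(4)}\rangle \\
&= \frac{1}{4}\,\mathbb{1}_4 \otimes \sigma_{\alpha\beta}(\sigma_{\alpha\beta}\sigma_{\gamma\delta})^T|\Psi_+^{(4)}\rangle
= \frac{1}{4}\,\varepsilon_\alpha\varepsilon_\gamma\varepsilon_\beta\varepsilon_\delta\; \mathbb{1}_4 \otimes \sigma_{\alpha\beta}\sigma_{\gamma\delta}\sigma_{\alpha\beta}|\Psi_+^{(4)}\rangle \\
&= \frac{1}{4}\,\tilde\eta_{\alpha\gamma}\tilde\eta_{\beta\delta}\,|\Psi_{\gamma\delta}\rangle,
\end{aligned}
\]

where it has been used that $\sigma_\alpha^T = \varepsilon_\alpha\sigma_\alpha$ with $\varepsilon_\alpha = 1$ if $\alpha \neq 2$, $= -1$ otherwise, that the algebra of the Pauli matrices implies

\[
\sigma_\alpha\sigma_\gamma\sigma_\alpha = \eta_{\alpha\gamma}\,\sigma_\gamma, \qquad
\eta = [\eta_{\alpha\gamma}] = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix},
\]

and it has been set

\[
\tilde\eta_{\alpha\gamma} = \varepsilon_\alpha\varepsilon_\gamma\,\eta_{\alpha\gamma}, \qquad
\tilde\eta := [\tilde\eta_{\alpha\gamma}] = \begin{pmatrix} 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \\ -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \end{pmatrix}.
\]

The vectors $|\Psi_{\gamma\delta}\rangle$ are thus eigenvectors of $\tilde P_{\alpha\beta}$ with eigenvalues $\frac{1}{4}\tilde\eta_{\alpha\gamma}\tilde\eta_{\beta\delta}$ and

\[
\tilde\rho_I := \mathrm{id}_4 \otimes T_4[\rho_I] = \sum_{(\gamma,\delta)\in L_{16}} \Big(\frac{1}{4N_I}\sum_{(\alpha,\beta)\in I} \tilde\eta_{\alpha\gamma}\tilde\eta_{\beta\delta}\Big) P_{\gamma\delta} .
\]

Since the matrix $\tilde\eta$ is symmetric, the eigenvalues of $\tilde\rho_I$ can be recast as

\[
\frac{1}{4N_I}\sum_{(\alpha,\beta)\in I} \tilde\eta_{\gamma\alpha}\tilde\eta_{\beta\delta} = \frac{1}{4}\big(\tilde\eta\, X^I \tilde\eta\big)_{\gamma\delta},
\]

where $X^I$ is a sort of characteristic matrix of the sublattice $I$ with entries $X^I_{\mu\nu} = \frac{1}{N_I}$ if $(\mu, \nu) \in I$, $= 0$ otherwise. Concretely, consider

\[
\rho = \frac{1}{6}\big(P_{02} + P_{11} + P_{23} + P_{31} + P_{32} + P_{33}\big);
\]

then, $\tilde\rho = \frac{1}{6}\big(P_{01} + P_{02} + P_{20} + P_{22} + P_{32} + P_{33}\big)$ for

\[
X^I = \frac{1}{6}\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix}, \qquad
\frac{1}{4}\,\tilde\eta\, X^I \tilde\eta = \frac{1}{6}\begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}.
\]

Thus, $\tilde\rho$ is positive, hence $\rho$ is PPT.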
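The PPT property of this lattice state can be verified mechanically. The following Python sketch computes the eigenvalues of the partially transposed state from the sign matrix $\tilde\eta_{\alpha\gamma} = \varepsilon_\alpha\varepsilon_\gamma\eta_{\alpha\gamma}$; the index set is read off from the non-zero entries of the characteristic matrix displayed above, and the function name `pt_spectrum` is an illustrative choice.

```python
from fractions import Fraction

EPS = [1, 1, -1, 1]
# eta[a][g] is the sign in sigma_a sigma_g sigma_a = eta[a][g] * sigma_g
ETA = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]
ETA_TILDE = [[EPS[a] * EPS[g] * ETA[a][g] for g in range(4)] for a in range(4)]


def pt_spectrum(index_set):
    """Eigenvalues of the partial transpose of the equidistributed state on index_set.

    The eigenvalue on P_{gamma delta} is
    (1 / 4N) * sum over (alpha, beta) in index_set of
    eta_tilde[alpha][gamma] * eta_tilde[beta][delta].
    """
    n = len(index_set)
    return {
        (g, d): Fraction(
            sum(ETA_TILDE[a][g] * ETA_TILDE[b][d] for a, b in index_set), 4 * n
        )
        for g in range(4)
        for d in range(4)
    }


INDEX_SET = [(0, 2), (1, 1), (2, 3), (3, 1), (3, 2), (3, 3)]
spectrum = pt_spectrum(INDEX_SET)
support = sorted(pair for pair, value in spectrum.items() if value != 0)
print(support)
print(min(spectrum.values()))
```

Exact rational arithmetic is used so that the non-negativity of all sixteen eigenvalues is not blurred by rounding.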

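We now show that $\mathrm{Tr}\big(\mathrm{id}_4 \otimes \Gamma_t[P_{00}]\,\rho\big) < 0$ for $0 < t < (\log 3)/2$, where $\Gamma_t$ is the positive semigroup of Example 6.2.1; by Lemma 6.2.2 it thus follows that $\rho$ is PPT entangled and $\Gamma_t$ indecomposable in that time-interval.

Since $\mathrm{Tr}_2[\,\cdot\,] = \frac{1}{2}\sum_{\mu=0}^{3} S_\mu[\,\cdot\,]$ and $T_2[\,\cdot\,] = \frac{1}{2}\sum_{\mu=0}^{3} \varepsilon_\mu S_\mu[\,\cdot\,]$, the two semigroups in Example 6.2.1 can be recast in the form

\[
\gamma_t^{(1)} = \frac{1 + 3\lambda_t}{4}\, S_0 + \frac{1 - \lambda_t}{4}\sum_{i=1}^{3} S_i, \qquad
\gamma_t^{(2)} = \frac{3 + \lambda_t}{4}\, S_0 + \frac{1 - \lambda_t}{4}\sum_{i=1}^{3} \varepsilon_i S_i,
\]

where $\lambda_t := e^{-2t}$, so that, with $S_{\alpha\beta}[\,\cdot\,] := \sigma_{\alpha\beta}\,\cdot\,\sigma_{\alpha\beta}$,

\[
\begin{aligned}
\Gamma_t &= \frac{(1 + 3\lambda_t)(3 + \lambda_t)}{16}\, S_{00} + \frac{(1 + 3\lambda_t)(1 - \lambda_t)}{16}\sum_{i=1}^{3} \varepsilon_i S_{0i}
+ \frac{(1 - \lambda_t)(3 + \lambda_t)}{16}\sum_{i=1}^{3} S_{i0} + \frac{(1 - \lambda_t)^2}{16}\sum_{i,j=1}^{3} \varepsilon_j S_{ij}, \\
\mathrm{id}_4 \otimes \Gamma_t[P_{00}] &= \frac{(1 + 3\lambda_t)(3 + \lambda_t)}{16}\, P_{00} + \frac{(1 + 3\lambda_t)(1 - \lambda_t)}{16}\sum_{i=1}^{3} \varepsilon_i P_{0i}
+ \frac{(1 - \lambda_t)(3 + \lambda_t)}{16}\sum_{i=1}^{3} P_{i0} + \frac{(1 - \lambda_t)^2}{16}\sum_{i,j=1}^{3} \varepsilon_j P_{ij} .
\end{aligned}
\]

It then turns out that

\[
0 < t < (\log 3)/2 \Longrightarrow \mathrm{Tr}\big(\mathrm{id}_4 \otimes \Gamma_t[P_{00}]\,\rho\big) = \frac{(1 - \lambda_t)(1 - 3\lambda_t)}{48} < 0 .
\]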
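The sign of the witness expectation can be reproduced numerically. The sketch below assumes the expansion of the two semigroups over the maps $S_\mu$ with $\lambda_t = e^{-2t}$, and takes for the state the index set $\{(0,2),(1,1),(2,3),(3,1),(3,2),(3,3)\}$ read off from the characteristic matrix of Example 6.2.2; since the projections $P_{\alpha\beta}$ are orthonormal under the trace, the expectation reduces to a sum of products of coefficients. Function names are illustrative.

```python
import math

EPS = [1, 1, -1, 1]
INDEX_SET = [(0, 2), (1, 1), (2, 3), (3, 1), (3, 2), (3, 3)]


def semigroup_coefficients(lam):
    """Coefficients of gamma^(1) and gamma^(2) on the maps S_0, S_1, S_2, S_3."""
    g1 = [(1 + 3 * lam) / 4] + [(1 - lam) / 4] * 3
    g2 = [(3 + lam) / 4] + [EPS[j] * (1 - lam) / 4 for j in (1, 2, 3)]
    return g1, g2


def witness_expectation(t):
    """Tr( id (x) Gamma_t[P_00] rho ) for the equidistributed state on INDEX_SET."""
    lam = math.exp(-2.0 * t)
    g1, g2 = semigroup_coefficients(lam)
    return sum(g1[a] * g2[b] for a, b in INDEX_SET) / len(INDEX_SET)


def closed_form(t):
    """The closed expression (1 - lam)(1 - 3 lam)/48 quoted in the text."""
    lam = math.exp(-2.0 * t)
    return (1 - lam) * (1 - 3 * lam) / 48


for t in (0.1, 0.3, math.log(3) / 2, 1.0):
    print(t, witness_expectation(t), closed_form(t))
```

The expectation is negative exactly on $0 < t < (\log 3)/2$ and vanishes at both endpoints.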


6.2.6 Dissipative Entanglement Generation 


Usually, dissipative reduced dynamics such as those addressed in Sect. 5.6.4, implemented by GKSL generators (5.213), lead to decoherence and loss of purity, in accordance with the fact that, in general, the non-negligible presence of an environment manifests itself through dissipation and noise. As decoherence is detrimental to quantum correlations, shielding quantum systems from sources of noise seems to be mandatory. However, this is not always necessary, as has been repeatedly clarified [22, 35, 44, 45, 47, 82, 188, 202, 284, 285, 299, 320]: in fact, the environment can be engineered so that the Kossakowski matrix which embodies its salient features is such that entanglement can not only be generated out of initial separable states by the corresponding dissipative time-evolution, but also made to persist asymptotically in time.

Consider an open quantum system consisting of two non-interacting qubits whose weak interaction with their environment leads to a reduced dissipative dynamics obeying the master equation

\[
\frac{\partial\rho(t)}{\partial t} = -i\big[H, \rho(t)\big] + D[\rho(t)] =: L[\rho(t)], \qquad (6.23)
\]

where $H = H_S + H_{LS}$, with $H_S$ the open system Hamiltonian in the absence of an environment. Assume that $H_S$ does not contain interaction terms between the two qubits, while such terms can be present in the Lamb-shift induced correction


1 
His=-5 >) (45? Cio 8h) + A M @ cioj) + HY” (0i ® aj) i 
ij 
(6.24) 
Instead, let $D$ be a purely dissipative contribution that mixes the degrees of freedom of the qubits,

\[
D[\rho] = \sum_{\alpha,\beta=1}^{6} K_{\alpha\beta}\Big(F_\beta\,\rho\,F_\alpha - \frac{1}{2}\big\{F_\alpha F_\beta, \rho\big\}\Big), \qquad (6.25)
\]

where $\{\cdot\,,\cdot\}$ denotes the anti-commutator and

\[
F_\alpha := \sigma_\alpha \otimes \mathbb{1}_2, \quad \alpha = 1, 2, 3; \qquad F_\alpha := \mathbb{1}_2 \otimes \sigma_{\alpha-3}, \quad \alpha = 4, 5, 6. \qquad (6.26)
\]

The Kossakowski matrix can be conveniently put in the form

\[
K = \begin{pmatrix} A & B \\ B^\dagger & C \end{pmatrix} \qquad (6.27)
\]

by means of the $3 \times 3$ matrices $A = A^\dagger \geq 0$, $C = C^\dagger \geq 0$ and $B$. If $B \neq 0$, it gives rise to a mixing of the different qubit degrees of freedom that may thus be responsible for the dissipative generation of entanglement. Indeed, when $B = 0$ and all $H^{(12)}_{ij} = 0$, the dissipative dynamics factorizes,

\[
\gamma_t = e^{tL} = e^{tL_1} \otimes e^{tL_2},
\]

with single qubit generators $L_1$, $L_2$.

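For two-qubit states $\rho$ evolving according to a dynamics $\gamma_t = e^{tL}$, the presence of entanglement is identified by the lack of positivity of the partial transposition (see Corollary 6.2.1) of the state $\gamma_t[\rho]$. Therefore, the possibility of short-time dissipative entanglement generation amounts to checking whether there exists a separable initial two-qubit state $|\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|$ such that, up to first order in $0 \leq t \ll 1$,

\[
T^{(2)} \circ \gamma_t\big[|\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|\big] \simeq |\varphi\rangle\langle\varphi| \otimes |\psi^*\rangle\langle\psi^*| + t\, T^{(2)} \circ L\big[|\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|\big]
\]

is not positive, where $|\psi^*\rangle\langle\psi^*|$ is obtained by transposition and thus by conjugating the entries of $|\psi\rangle$ with respect to the standard basis. Lack of positivity can be checked by seeking a state vector $\Psi \in \mathbb{C}^4$ orthogonal to $|\varphi\rangle \otimes |\psi^*\rangle$ such that, as in the proof of Theorem 5.6.1,

\[
\langle\Psi|\, T^{(2)} \circ \gamma_t\big[|\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|\big]\,|\Psi\rangle \simeq t\,\langle\Psi|\, \tilde L\big[|\varphi\rangle\langle\varphi| \otimes |\psi^*\rangle\langle\psi^*|\big]\,|\Psi\rangle < 0 . \qquad (6.28)
\]

The map $\tilde L := T^{(2)} \circ L \circ T^{(2)}$ is such that [47]

\[
\langle\Psi|\, \tilde L\big[|\varphi\rangle\langle\varphi| \otimes |\psi^*\rangle\langle\psi^*|\big]\,|\Psi\rangle = \sum_{\alpha,\beta=1}^{6} \tilde K_{\alpha\beta}\, \langle\Psi|\, F_\alpha \big(|\varphi\rangle\langle\varphi| \otimes |\psi^*\rangle\langle\psi^*|\big) F_\beta\,|\Psi\rangle,
\]

with

\[
\tilde K = \begin{pmatrix} A & \mathrm{Re}(B) + i H^{(12)} \\ \mathrm{Re}(B^T) - i (H^{(12)})^T & C^T \end{pmatrix}, \qquad (6.29)
\]

where $H^{(12)}$ is the matrix formed by the two-qubit interaction coefficients in (6.24).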


Remark 6.2.3 Unlike the Kossakowski matrix in (6.27), $\tilde K$ need not be positive semi-definite; indeed, the generator $\tilde L$ gives rise to a semigroup that involves partial transpositions and thus does not in general consist of completely positive maps. Furthermore, the presence in the off-diagonal entries of $\tilde K$ of the two-qubit interaction matrix $H^{(12)}$, which is responsible for the dynamical creation of two-qubit entanglement, indicates that the real part of the mixing matrix $B$ might also contribute to it. Interestingly, if it does, the entanglement due to it does not originate from interactions between the qubits, but rather from the statistical mixing action dissipatively operated by the environment.


In order to study the ability of the environment to generate entanglement, the state $|\Psi\rangle$ must be entangled, otherwise inequality (6.28) cannot hold. Indeed, letting $|\Psi\rangle\langle\Psi|$ evolve by duality, it would remain a state and the left hand side of (6.28) would stay positive at all times $t \geq 0$. One can thus seek $|\Psi\rangle$ of the form


IY) = Yar |e) @ |P") + Yar 1G) @ |v") + Yor IG) @ |B"), 630 


with respect to the orthonormal bases $\{|\varphi\rangle, |\tilde\varphi\rangle\}$ and $\{|\psi\rangle, |\tilde\psi\rangle\}$ of the two qubits. Also, if the environment is not able to entangle any initial state of the form $\rho_S(0) = |\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|$, then, because of the semigroup structure of the dynamics, it surely cannot entangle separable mixed states either at $t = 0$ or at any later time. Indeed, these latter states are, and cannot but evolve into, convex combinations of pure separable states.

The chosen orthonormal basis can always be obtained by appropriately rotating 
the standard basis {|0), |1)} of eigenvectors of the Pauli matrix o3: 


lp) = UII), 1%) = U0), 
Iv) = VII), |v) = VIO), 


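where $U$ and $V$ are unitary operators inducing orthogonal transformations $\mathcal{U}$ and $\mathcal{V}$ of the Pauli matrices:

\[
U^\dagger \sigma_i U = \sum_{j=1}^{3} \mathcal{U}_{ij}\,\sigma_j, \qquad V^\dagger \sigma_i V = \sum_{j=1}^{3} \mathcal{V}_{ij}\,\sigma_j .
\]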


Then, as shown in [47], the semigroup $\gamma_t = e^{tL}$ generates short-time entanglement in the initial state $|\varphi\rangle\langle\varphi| \otimes |\psi\rangle\langle\psi|$, that is condition (6.28) is satisfied, if and only if

\[
\langle u|A|u\rangle\,\langle v|C^T|v\rangle < \big|\langle u|\big(\mathrm{Re}(B) + i H^{(12)}\big)|v\rangle\big|^2, \qquad (6.31)
\]

where $\mathrm{Re}(B) := \frac{B + B^*}{2}$, while $|u\rangle$ and $|v\rangle$ are 3-dimensional complex vectors with components

\[
u_i = \sum_{j=1}^{3} \mathcal{U}_{ij}\,\langle 0|\sigma_j|1\rangle, \qquad v_i = \sum_{j=1}^{3} \mathcal{V}_{ij}\,\langle 1|\sigma_j|0\rangle . \qquad (6.32)
\]

Remark 6.2.4 Short-time entanglement generation certainly depends on the initial two-qubit separable state; it can thus happen that the same dissipative dynamics is able to entangle only a subset of initially separable states. Furthermore, as already observed, the $3 \times 3$ matrix $B$ which statistically mixes the two qubits can be responsible for the entanglement generation, as much as the Hamiltonian couplings of the two qubits.


A particularly simple, yet physically interesting, condition, as shown in the case of the Unruh effect in [34], is when the two qubits have the same type of interaction with the environment. In general, the interaction is of the form


Him = Y` ((0281) @ 0 + (teoa) 802) , 


a=0 


where oo) are the bath operators that couple to the two qubit Pauli matrices. 
The coefficients Kag of the Kossakowski matrix are the Fourier transforms of the 
environment two-point time-correlation functions 


T(r Po PO), j,k=1,2, 


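where $\rho_\beta$ is a suitable equilibrium state of the environment. Therefore, if $\Phi_\alpha^{(1)} = \Phi_\alpha^{(2)}$, then the Kossakowski matrix in (6.25) becomes

\[
K = \begin{pmatrix} A & A \\ A & A \end{pmatrix}, \qquad (6.33)
\]

with four identical $3 \times 3$ blocks $A = [A_{ij}]$. Consequently, the two-qubit dissipative dynamics without feed-back is generated by the following master equation:

\[
\partial_t \rho(t) = -i\big[H, \rho(t)\big] + \sum_{i,j=1}^{3} A_{ij}\Big[\Sigma_j\,\rho(t)\,\Sigma_i - \frac{1}{2}\big\{\Sigma_i\Sigma_j, \rho(t)\big\}\Big], \qquad (6.34)
\]

where we have introduced the symmetric two-qubit operators

\[
\Sigma_i := \sigma_i \otimes \mathbb{1}_2 + \mathbb{1}_2 \otimes \sigma_i, \qquad i = 1, 2, 3.
\]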


Since the Kossakowski matrix $K$ is positive semi-definite, such must also be the $3 \times 3$ matrix $A$. The latter can be decomposed into the sum of its symmetric and anti-symmetric parts,

\[
A = \mathcal{A} + \mathcal{B}, \qquad \mathcal{A} := \frac{A + A^T}{2} = \mathrm{Re}(A), \qquad \mathcal{B} := \frac{A - A^T}{2} = i\,\mathrm{Im}(A),
\]

whereby the positivity of $A$ implies that $\mathcal{A}$ is real-symmetric and positive itself. Furthermore, the anti-symmetric component $\mathcal{B}$ can be recast as

\[
A_{ij} = \mathcal{A}_{ij} + i\sum_{k=1}^{3} \varepsilon_{ijk}\, b_k, \qquad b_k \in \mathbb{R}.
\]

In the absence of an entangling Hamiltonian, the presence of an anti-symmetric component of $A$ is necessary for entanglement generation with a Kossakowski matrix as in (6.33). Indeed, with all $H^{(12)}_{ij} = 0$ in (6.24) and $B = C = A = \mathcal{A}$, the condition (6.31) cannot be fulfilled by any choice of initially separable pure states. In fact, using the Cauchy-Schwarz inequality and the fact that $\mathrm{Re}(A) = \mathcal{A}$, condition (6.31) would require

\[
\langle u|\mathcal{A}|u\rangle\,\langle v|\mathcal{A}|v\rangle < \big|\langle u|\mathcal{A}|v\rangle\big|^2 \leq \langle u|\mathcal{A}|u\rangle\,\langle v|\mathcal{A}|v\rangle .
\]

On the contrary, even when all $H^{(12)}_{ij} = 0$, the presence of $\mathcal{B}$ guarantees that entanglement is generated for any initial pure separable state $|\varphi\rangle \otimes |\psi\rangle$ that yields $|u\rangle = |v\rangle$ in (6.31). Using that $A = \mathcal{A} + \mathcal{B}$ and $A^T = \mathcal{A} - \mathcal{B}$, condition (6.31) becomes

\[
\langle u|(\mathcal{A} + \mathcal{B})|u\rangle\,\langle u|(\mathcal{A} - \mathcal{B})|u\rangle - \langle u|\mathcal{A}|u\rangle^2 = -\langle u|\mathcal{B}|u\rangle^2 < 0 .
\]
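The last identity is easy to test on a concrete instance. The following Python sketch builds $A$ from a real symmetric part and a real vector $b$ through $A_{ij} = \mathcal{A}_{ij} + i\sum_k \varepsilon_{ijk} b_k$ and evaluates both sides for a complex vector $u$; the numerical values of the symmetric part, of $b$ and of $u$ are arbitrary illustrative choices, as are the function names.

```python
import math


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def expval(u, m):
    """<u| m |u> for a 3-component complex vector u and a 3x3 matrix m."""
    return sum(u[i].conjugate() * m[i][j] * u[j]
               for i in range(3) for j in range(3))


def entanglement_gap(sym, b, u):
    """Return (<u|A|u><u|A^T|u> - <u|sym|u>^2, <u|anti|u>), with A = sym + anti."""
    anti = [[1j * sum(levi_civita(i, j, k) * b[k] for k in range(3))
             for j in range(3)] for i in range(3)]
    a = [[sym[i][j] + anti[i][j] for j in range(3)] for i in range(3)]
    a_t = [[sym[i][j] - anti[i][j] for j in range(3)] for i in range(3)]
    gap = expval(u, a) * expval(u, a_t) - expval(u, sym) ** 2
    return gap.real, expval(u, anti).real


SYM = [[2.0, 0.5, 0.0], [0.5, 1.5, 0.0], [0.0, 0.0, 1.0]]  # real symmetric part
B_VEC = [0.0, 0.0, 0.8]                                     # real vector b
U_VEC = [1 / math.sqrt(2), 1j / math.sqrt(2), 0j]           # complex test vector

gap, anti_expectation = entanglement_gap(SYM, B_VEC, U_VEC)
print(gap, -anti_expectation ** 2)
```

A real vector `u` would give a vanishing expectation of the anti-symmetric part, so the strict inequality needs a genuinely complex `u`, as is the case for the vectors in (6.32).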


This latter condition together with the form (6.33) is verified in the case of the dissipative behaviour induced on two uniformly accelerated qubits by the Unruh effect [34]. Indeed, a Rindler wedge appears in the Minkowski space of two closely located, uniformly accelerated qubits; they thus experience the Minkowski vacuum as a thermal equilibrium state and can couple to such a Bosonic environment, which is then able to entangle them at short times. Also, because of the high symmetry embodied in the matrix (6.33), the structure of the asymptotic states of the dissipative dynamics can be controlled and shown, in some cases, to make the entanglement generated at short times persist asymptotically in time [34, 36, 37].


6.3 Relative Entropy 


A notion directly related to the von Neumann entropy, with several useful applications in quantum information and of great importance for the topics discussed later in the book, is the quantum relative entropy. It is the quantum counterpart of the Kullback-Leibler distance (see (2.94)) and has already been introduced in the proof of some properties of the von Neumann entropy (see Proposition 5.5.7).


Definition 6.3.1 (Relative Entropy) Let $S$ be a quantum system described by a $d$-dimensional Hilbert space $\mathbb{H}$ and $\rho, \sigma \in \mathcal{B}_1^+(\mathbb{H})$ two density matrices acting on $\mathbb{H}$. The relative entropy of $\rho$ with respect to $\sigma$ is

\[
S(\rho, \sigma) := \begin{cases} \mathrm{Tr}\big(\rho\,(\log\rho - \log\sigma)\big) & \text{if } \mathrm{Ker}(\sigma) \subseteq \mathrm{Ker}(\rho) \\ +\infty & \text{otherwise,} \end{cases} \qquad (6.35)
\]

$\mathrm{Ker}(\sigma)$ and $\mathrm{Ker}(\rho)$ being the subspaces where $\sigma$ and $\rho$ vanish.
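For a single qubit the definition can be evaluated in closed form, which gives a convenient playground. The sketch below is an illustration under the following assumptions: states are parametrised by Bloch vectors $r$ with $|r| < 1$ (strictly positive density matrices $\frac{1}{2}(\mathbb{1} + r\cdot\sigma)$), and use is made of $\log\rho = \frac{1}{2}\log\frac{1 - |r|^2}{4}\,\mathbb{1} + \mathrm{artanh}(|r|)\,\hat r\cdot\sigma$. The function names are not taken from the text.

```python
import math


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _log_coefficients(length):
    """Coefficients (a, b) of log(rho) = a*1 + b*(unit Bloch vector . sigma)."""
    return 0.5 * math.log((1.0 - length * length) / 4.0), math.atanh(length)


def relative_entropy(r, s):
    """S(rho, sigma) for qubit states with Bloch vectors r and s, both of length < 1."""
    rn, sn = _norm(r), _norm(s)
    if rn >= 1.0 or sn >= 1.0:
        raise ValueError("this sketch only handles strictly positive states")
    a_r, b_r = _log_coefficients(rn)
    a_s, b_s = _log_coefficients(sn)
    tr_rho_log_rho = a_r + b_r * rn
    # Tr(rho log sigma) = a_s + b_s * (r . s_hat); the second term vanishes if s = 0.
    overlap = sum(x * y for x, y in zip(r, s)) / sn if sn > 0 else 0.0
    return tr_rho_log_rho - (a_s + b_s * overlap)


r = (0.3, 0.1, -0.4)
s = (-0.2, 0.5, 0.1)
print(relative_entropy(r, s))
print(relative_entropy(r, r))
# Shrinking both Bloch vectors by the same factor models a depolarizing channel.
print(relative_entropy(tuple(0.5 * x for x in r), tuple(0.5 * x for x in s)))
```

When both Bloch vectors point along the same axis the two states commute and the result coincides with the classical Kullback-Leibler distance of the corresponding eigenvalue distributions.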


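Example 6.3.1 (Relative modular operator [267, 280, 308, 309]) Given the Hilbert space $\mathbb{H} = \mathbb{C}^d$, equip the algebra $M_d(\mathbb{C})$ with the Hilbert-Schmidt scalar product (5.28) and denote it as $\langle\langle\,\cdot\,,\cdot\,\rangle\rangle$. Then, consider the following linear operators on $M_d(\mathbb{C})$:

\[
L_X[Y] = X\,Y, \qquad R_X[Y] = Y\,X .
\]

They commute with each other, $L_X R_Z[Y] = X\,Y\,Z = R_Z L_X[Y]$; moreover, if $X = X^\dagger$ they are self-adjoint. Indeed, the cyclicity of the trace operation yields

\[
\begin{aligned}
\langle\langle Y, L_X[Z]\rangle\rangle &= \mathrm{Tr}\big(Y^\dagger X Z\big) = \mathrm{Tr}\big((X Y)^\dagger Z\big) = \langle\langle L_X[Y], Z\rangle\rangle, \\
\langle\langle Y, R_X[Z]\rangle\rangle &= \mathrm{Tr}\big(Y^\dagger Z X\big) = \mathrm{Tr}\big((Y X)^\dagger Z\big) = \langle\langle R_X[Y], Z\rangle\rangle .
\end{aligned}
\]

Further, if $X \geq 0$ the operators $L_X$ and $R_X$ turn out to be positive; in fact,

\[
\langle\langle Z, L_X[Z]\rangle\rangle = \mathrm{Tr}\big(Z^\dagger X Z\big) \geq 0, \qquad
\langle\langle Z, R_X[Z]\rangle\rangle = \mathrm{Tr}\big(Z^\dagger Z X\big) = \mathrm{Tr}\big(Z X Z^\dagger\big) \geq 0 .
\]

Let $P_i$ and $P_j$ be orthogonal projections; then, $L_{P_i} L_{P_j} = \delta_{ij} L_{P_i}$ and $R_{P_i} R_{P_j} = \delta_{ij} R_{P_i}$. As a consequence, from the spectral representation $X = X^\dagger = \sum_{i=1}^{d} x_i\,|x_i\rangle\langle x_i|$, one derives the spectral representations

\[
L_X = \sum_{i=1}^{d} x_i\, L_{|x_i\rangle\langle x_i|}, \qquad R_X = \sum_{i=1}^{d} x_i\, R_{|x_i\rangle\langle x_i|},
\]

with orthogonal projections $\{L_{|x_i\rangle\langle x_i|}\}_{i=1}^{d}$, respectively $\{R_{|x_i\rangle\langle x_i|}\}_{i=1}^{d}$. Suppose $\rho \in \mathcal{B}_1^+(\mathbb{H})$ is strictly positive; then $R_{\rho^{-1}} = R_\rho^{-1}$ is well defined, as well as the relative modular operator of $\rho$ and $\sigma \in \mathcal{B}_1^+(\mathbb{H})$,

\[
\Delta_{\rho,\sigma} := L_\sigma R_\rho^{-1} = R_\rho^{-1} L_\sigma = \sum_{i,j=1}^{d} s_i\, r_j^{-1}\, L_{|s_i\rangle\langle s_i|}\, R_{|r_j\rangle\langle r_j|}, \qquad (6.36)
\]

where the spectralizations $\rho = \sum_{j=1}^{d} r_j\,|r_j\rangle\langle r_j|$ and $\sigma = \sum_{i=1}^{d} s_i\,|s_i\rangle\langle s_i|$ have been used. If both $\rho$ and $\sigma$ are strictly positive, the same is true of their relative modular operator: for all $X \in M_d(\mathbb{C})$,

\[
\langle\langle X, \Delta_{\rho,\sigma}[X]\rangle\rangle = \mathrm{Tr}\Big(\sum_{i,j=1}^{d} s_i\, r_j^{-1}\, X^\dagger\,|s_i\rangle\langle s_i|\,X\,|r_j\rangle\langle r_j|\Big) = \sum_{i,j=1}^{d} s_i\, r_j^{-1}\,\big|\langle r_j|X^\dagger|s_i\rangle\big|^2 \geq 0 .
\]

Also,

\[
\log\Delta_{\rho,\sigma} = \log L_\sigma - \log R_\rho = \sum_{i=1}^{d}\Big(\log s_i\; L_{|s_i\rangle\langle s_i|} - \log r_i\; R_{|r_i\rangle\langle r_i|}\Big).
\]

This yields the following expression for the relative entropy

\[
\langle\langle \sqrt{\rho}\,, -(\log\Delta_{\rho,\sigma})[\sqrt{\rho}]\rangle\rangle = S(\rho, \sigma).
\]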

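Of the following properties of the relative entropy, joint convexity and monotonicity under CPU maps have already been used in the proof of the properties (5.172) and (5.173) of the von Neumann entropy in Proposition 5.5.7.


Proposition 6.3.1 The relative entropy of $\rho, \sigma \in \mathcal{B}_1^+(\mathbb{H})$ is

1. positive: $S(\rho, \sigma) \geq 0$, $S(\rho, \sigma) = 0$ iff $\rho = \sigma$;
2. jointly convex: given weights $\lambda_i \geq 0$, $i \in I$, $\sum_{i\in I}\lambda_i = 1$, and density matrices $\rho_i, \sigma_i \in \mathcal{B}_1^+(\mathbb{H})$, $i \in I$,

\[
S\Big(\sum_{i\in I}\lambda_i\rho_i\,, \sum_{i\in I}\lambda_i\sigma_i\Big) \leq \sum_{i\in I}\lambda_i\, S(\rho_i, \sigma_i); \qquad (6.37)
\]

3. invariant under unitary maps: let $\rho, \sigma \in \mathcal{B}_1^+(\mathbb{H})$ and $U: \mathbb{H} \to \mathbb{H}$ a unitary map, then

\[
S\big(U\rho\,U^\dagger, U\sigma\,U^\dagger\big) = S(\rho, \sigma); \qquad (6.38)
\]

4. monotonically decreasing under trace-preserving CP maps: let $\rho, \sigma \in \mathcal{B}_1^+(\mathbb{H})$ and $F: \mathcal{B}_1^+(\mathbb{H}) \to \mathcal{B}_1^+(\mathbb{H})$ be a CP map such that $\mathrm{Tr}(F[\rho]) = \mathrm{Tr}(\rho)$, then

\[
S\big(F[\rho], F[\sigma]\big) \leq S(\rho, \sigma). \qquad (6.39)
\]

Writing $F[\rho] = \rho \circ E$, where $E: \mathcal{B}(\mathbb{H}) \to \mathcal{B}(\mathbb{H})$ is the dual CPU map of $F$, monotonicity reads

\[
S(\rho \circ E, \sigma \circ E) \leq S(\rho, \sigma). \qquad (6.40)
\]

Also, joint convexity is equivalently expressed by the inequality

\[
S\Big(\sum_{i\in I}\tilde\rho_i\,, \sum_{i\in I}\tilde\sigma_i\Big) \leq \sum_{i\in I} S(\tilde\rho_i, \tilde\sigma_i), \qquad (6.41)
\]

where $\tilde\rho_i := \lambda_i\rho_i$ and $\tilde\sigma_i := \lambda_i\sigma_i$.

Proof • Positivity: by means of the eigenvalues $r_j$, $s_k$ (repeated according to their multiplicities) and of the eigenbases $|r_j\rangle$, $|s_k\rangle$ of $\rho$, respectively $\sigma$, one computes

\[
\begin{aligned}
S(\rho, \sigma) &= \sum_i r_i\big(\log r_i - \langle r_i|\log\sigma|r_i\rangle\big)
= \sum_i r_i\Big(\log r_i - \sum_k |\langle r_i|s_k\rangle|^2 \log s_k\Big) \\
&\geq \sum_i r_i\Big(\log r_i - \log\sum_k s_k\,|\langle r_i|s_k\rangle|^2\Big)
= \sum_i r_i\big(\log r_i - \log\langle r_i|\sigma|r_i\rangle\big) \\
&\geq \sum_i \big(r_i - \langle r_i|\sigma|r_i\rangle\big) = 0 .
\end{aligned}
\]

The first inequality follows from $\sum_k |\langle r_i|s_k\rangle|^2 = 1$ and the concavity of the logarithm, while the second one is a consequence of the concavity of $\eta(x) = -x\log x$, $0 \leq x \leq 1$ (see (2.85) in Sect. 2.4.3), which also implies that equality only holds when the eigenvalues and thus the density matrices coincide.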

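• Joint convexity: we shall establish (6.41). Notice that, for all $w > 0$,

\[
\begin{aligned}
-\log w &= \int_0^{+\infty} \mathrm{d}t\,\Big(\frac{1}{w + t} - \frac{1}{1 + t}\Big)
= (1 - w)\int_0^{+\infty} \mathrm{d}t\,\Big(\frac{1}{(1 + t)^2} + \frac{1 - w}{(w + t)(1 + t)^2}\Big) \\
&= (1 - w) + \int_0^{+\infty} \frac{\mathrm{d}t}{(1 + t)^2}\,\frac{(w - 1)^2}{w + t} .
\end{aligned}
\]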


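Then, use the spectral representation of the relative modular operator (6.36) to insert $\Delta_{\rho,\sigma}$ in the place of $w$, act with $-\log\Delta_{\rho,\sigma}$ on $\rho$ and take the trace of the resulting matrix. Since $\mathrm{Tr}\big((\mathbb{1} - \Delta_{\rho,\sigma})[\rho]\big) = \mathrm{Tr}(\rho - \sigma) = 0$, it follows

\[
S(\rho, \sigma) = -\mathrm{Tr}\big(\log\Delta_{\rho,\sigma}[\rho]\big) = \int_0^{+\infty} \frac{\mathrm{d}t}{(1 + t)^2}\,\mathrm{Tr}\Big((\mathbb{1} - \Delta_{\rho,\sigma})\,\frac{1}{\Delta_{\rho,\sigma} + t\,\mathbb{1}}\,[\rho - \sigma]\Big).
\]

Further, setting $Y = \mathbb{1}$ and $X = (\Delta_{\rho,\sigma} + t\,\mathbb{1})^{-1}[\rho - \sigma]$ in

\[
\langle\langle Y, (\mathbb{1} - \Delta_{\rho,\sigma})[X]\rangle\rangle = \langle\langle Y, (R_\rho - L_\sigma)R_\rho^{-1}[X]\rangle\rangle = \langle\langle (R_\rho - L_\sigma)[Y], R_\rho^{-1}[X]\rangle\rangle,
\]

yields

\[
S(\rho, \sigma) = \int_0^{+\infty} \frac{\mathrm{d}t}{(1 + t)^2}\,\mathrm{Tr}\Big((\rho - \sigma)\,\big(L_\sigma + t R_\rho\big)^{-1}[\rho - \sigma]\Big). \qquad (6.42)
\]

Let now $\tilde\rho_j$ and $\tilde\sigma_j$ be as in (6.41), with $\rho = \sum_j\tilde\rho_j$ and $\sigma = \sum_j\tilde\sigma_j$, and

\[
X_j := \big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big)^{-1/2}[\tilde\rho_j - \tilde\sigma_j] - \big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big)^{1/2}[B],
\]

with $B = B^\dagger \in M_d(\mathbb{C})$ to be defined later. Then, since the various operators are self-adjoint with respect to the Hilbert-Schmidt scalar product, by observing that $\sum_j \big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big) = L_\sigma + t R_\rho$, one obtains

\[
\begin{aligned}
0 \leq \sum_j \langle\langle X_j, X_j\rangle\rangle &= \sum_j \langle\langle \tilde\rho_j - \tilde\sigma_j\,, \big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big)^{-1}[\tilde\rho_j - \tilde\sigma_j]\rangle\rangle \\
&\quad - \langle\langle \rho - \sigma, B\rangle\rangle - \langle\langle B, \rho - \sigma\rangle\rangle + \langle\langle B, (L_\sigma + t R_\rho)[B]\rangle\rangle .
\end{aligned}
\]

By choosing $B = (L_\sigma + t R_\rho)^{-1}[\rho - \sigma]$, one gets the inequality

\[
\begin{aligned}
\sum_j \langle\langle \tilde\rho_j - \tilde\sigma_j\,, \big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big)^{-1}[\tilde\rho_j - \tilde\sigma_j]\rangle\rangle
&= \sum_j \mathrm{Tr}\Big((\tilde\rho_j - \tilde\sigma_j)\,\big(L_{\tilde\sigma_j} + t R_{\tilde\rho_j}\big)^{-1}[\tilde\rho_j - \tilde\sigma_j]\Big) \\
&\geq \mathrm{Tr}\Big((\rho - \sigma)\,(L_\sigma + t R_\rho)^{-1}[\rho - \sigma]\Big),
\end{aligned}
\]

which, once inserted in (6.42), yields the result.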
e Invariance: it follows from the fact that log(U pU*) = U (log p)UÌ and that the 


same holds for ø. 
e Monotonicity: it is implied by joint convexity. We shall first show that 


5 (p1; 01) < S(pi2, 712) , 


where p12, 012 € i (H12) with marginal states p1 2 = Tr2,1 p12, respectively o1 ,2 = 
Tr2,1012. Let d; = dim (H;), fix an ONB {| j Ia ;in Hz and define the unitary matri- 


fn 

ces Up € Ma, (C), € = 1,2, ..., do, with entries (Up) jx = dx exp (jo. Then, 
2 

for all X € Ma, (©), 


d2 d2 
1 1 iek), . . 
P(X] = FP UXU =F YY e ON GX IYI NE 
t=1 £, j,k=1 


d2 
=J IJJI IXI); 
j=l 
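whence $\rho_1=\mathrm{Tr}_2\big(\mathrm{id}_1\otimes\mathbb{P}[\rho_{12}]\big)$ and similarly for $\sigma$. Furthermore, using the basis $\{|j\rangle\}_{j=1}^{d_2}$ to write $\rho_{12}=\sum_{j,k=1}^{d_2}\rho_{jk}\otimes|j\rangle\langle k|$, with $\rho_{jk}\in M_{d_1}(\mathbb{C})$, and similarly for $\sigma_{12}$, then

$$S(\rho_1,\sigma_1)=S\Big(\sum_{j=1}^{d_2}\rho_{jj},\sum_{j=1}^{d_2}\sigma_{jj}\Big)\le\sum_{j=1}^{d_2}S(\rho_{jj},\sigma_{jj})$$

$$=S\Big(\sum_{j=1}^{d_2}\rho_{jj}\otimes|j\rangle\langle j|,\sum_{j=1}^{d_2}\sigma_{jj}\otimes|j\rangle\langle j|\Big)$$

$$=S\big(\mathrm{id}_1\otimes\mathbb{P}[\rho_{12}],\mathrm{id}_1\otimes\mathbb{P}[\sigma_{12}]\big)$$

$$\le\frac{1}{d_2}\sum_{\ell=1}^{d_2}S\Big((\mathbb{1}_1\otimes U_\ell)\rho_{12}(\mathbb{1}_1\otimes U_\ell)^\dagger,(\mathbb{1}_1\otimes U_\ell)\sigma_{12}(\mathbb{1}_1\otimes U_\ell)^\dagger\Big)$$

$$=S(\rho_{12},\sigma_{12})\ ,$$

where the second equality follows from the orthogonality of the matrices contributing to the sums, the last equality follows since the matrices $\mathbb{1}_1\otimes U_\ell$ are unitary and because of the invariance of the relative entropy, while the two inequalities are a consequence of its joint convexity.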

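Monotonicity under trace preserving CP maps thus results from Remark 5.6.3, by writing $\mathbb{E}[\rho]=\mathrm{Tr}_E\big(U(\rho\otimes\rho_E)U^\dagger\big)$:

$$S(\mathbb{E}[\rho],\mathbb{E}[\sigma])\le S\big(U(\rho\otimes\rho_E)U^\dagger,U(\sigma\otimes\rho_E)U^\dagger\big)=S(\rho\otimes\rho_E,\sigma\otimes\rho_E)=S(\rho,\sigma)\ .$$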
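The first step of the monotonicity argument, $S(\rho_1,\sigma_1)\le S(\rho_{12},\sigma_{12})$, can be checked numerically. The following is only a minimal sketch assuming NumPy; the helper names (`random_state`, `log_psd`, `relative_entropy`, `trace_out_second`) are illustrative and do not come from the text, and natural logarithms are used.

```python
import numpy as np

def random_state(dim, rng):
    """Random full-rank density matrix (illustrative Ginibre construction)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def log_psd(rho):
    """Logarithm of a strictly positive Hermitian matrix via its spectrum."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """S(rho, sigma) = Tr rho (log rho - log sigma)."""
    return np.trace(rho @ (log_psd(rho) - log_psd(sigma))).real

def trace_out_second(rho12, d1, d2):
    """Marginal state on the first factor of C^d1 (x) C^d2."""
    return np.einsum("ajbj->ab", rho12.reshape(d1, d2, d1, d2))

rng = np.random.default_rng(0)
d1, d2 = 2, 3
rho12 = random_state(d1 * d2, rng)
sigma12 = random_state(d1 * d2, rng)

s_joint = relative_entropy(rho12, sigma12)
s_marginal = relative_entropy(trace_out_second(rho12, d1, d2),
                              trace_out_second(sigma12, d1, d2))
print(s_marginal, "<=", s_joint)
```

Random full-rank states are used so that all logarithms are finite; the inequality itself does not depend on this choice.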


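Example 6.3.2 As an application of joint convexity, let $\rho\in\mathbb{B}_1^+(\mathbb{H})$ be a density matrix describing a statistical mixture $\{\lambda_{ij},\rho_{ij}\}$: $\rho=\sum_{ij}\lambda_{ij}\rho_{ij}$, $\sum_{ij}\lambda_{ij}=1$. Setting $\tilde\rho_{ij}:=\lambda_{ij}\rho_{ij}$, $\tilde\rho^{1}_i:=\sum_j\tilde\rho_{ij}$ and $\tilde\rho^{2}_j:=\sum_i\tilde\rho_{ij}$, one derives

$$\sum_{ij}\lambda_{ij}S(\rho_{ij},\rho)=\sum_{ij}\mathrm{Tr}\,\tilde\rho_{ij}\log\tilde\rho_{ij}-\mathrm{Tr}\,\rho\log\rho-\sum_{ij}\lambda_{ij}\log\lambda_{ij}$$

$$=\sum_{ij}S(\tilde\rho_{ij},\tilde\rho^{1}_i)+\sum_iS(\tilde\rho^{1}_i,\rho)-\sum_{ij}\lambda_{ij}\log\lambda_{ij}\ ,$$

whence (6.41) applied with reference to the sum over the index $i$ and the fact that $\sum_i\tilde\rho^{1}_i=\rho$ yield

$$\sum_{ij}\lambda_{ij}S(\rho_{ij},\rho)\ge\sum_jS(\tilde\rho^{2}_j,\rho)+\sum_iS(\tilde\rho^{1}_i,\rho)-\sum_{ij}\lambda_{ij}\log\lambda_{ij}$$

$$=\sum_j\lambda^{2}_jS(\rho^{2}_j,\rho)+\sum_i\lambda^{1}_iS(\rho^{1}_i,\rho)+\sum_i\lambda^{1}_i\log\lambda^{1}_i+\sum_j\lambda^{2}_j\log\lambda^{2}_j-\sum_{ij}\lambda_{ij}\log\lambda_{ij}\ ,$$

where $\lambda^{1}_i:=\sum_j\lambda_{ij}$, $\lambda^{2}_j:=\sum_i\lambda_{ij}$, $\rho^{1}_i:=\tilde\rho^{1}_i/\lambda^{1}_i$ and $\rho^{2}_j:=\tilde\rho^{2}_j/\lambda^{2}_j$.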

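The physical interpretation of the relative entropy comes from thermodynamics: there, it amounts to a free energy difference. As such, it can only decrease under dissipative time-evolutions [224,338]. Let $\sigma$ in (6.35) be the Gibbs state at inverse temperature $\beta=T^{-1}$ with respect to a Hamiltonian operator $H\in\mathbb{B}(\mathbb{H})$ (see (5.187)):

$$\sigma=\rho_\beta:=Z_\beta^{-1}\exp(-\beta H)\ ,\qquad Z_\beta=\mathrm{Tr}\big(\exp(-\beta H)\big)\ .$$

The free energy of a state $\rho\in\mathbb{B}_1^+(\mathbb{H})$ is

$$F(\rho):=\langle H\rangle_\rho-T\,S(\rho)\ ,\qquad\langle H\rangle_\rho=\mathrm{Tr}(\rho\,H)\ .$$

From $F(\rho_\beta)=-T\log Z_\beta$ it follows that

$$S(\rho,\rho_\beta)=-S(\rho)+\log Z_\beta+\beta\langle H\rangle_\rho=\beta\big(F(\rho)-F(\rho_\beta)\big)\ .$$

Let $S$ undergo an irreversible evolution described by a quantum dynamical semigroup $\gamma_t:\mathcal{S}(S)\mapsto\mathcal{S}(S)$, $t\ge0$, with $\rho_\beta$ an equilibrium state, namely $\gamma_t[\rho_\beta]=\rho_\beta$. Then, as seen in Chap. 3, the dynamical maps $\gamma_t$ are completely positive and fulfill $\gamma_t=\gamma_{t-s}\circ\gamma_s$, $t\ge s$. Thus, monotonicity (6.39) yields

$$S(\gamma_t[\rho],\gamma_t[\rho_\beta])=S(\gamma_t[\rho],\rho_\beta)=\beta\big(F(\gamma_t[\rho])-F(\rho_\beta)\big)$$

$$=S(\gamma_{t-s}\circ\gamma_s[\rho],\gamma_{t-s}[\rho_\beta])$$

$$\le S(\gamma_s[\rho],\rho_\beta)=\beta\big(F(\gamma_s[\rho])-F(\rho_\beta)\big)\ .\qquad (6.43)$$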
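The link between relative entropy and free energy can be verified on a small example. The sketch below assumes NumPy and the standard convention $F(\rho)=\langle H\rangle_\rho-T\,S(\rho)$, for which $F(\rho_\beta)=-T\log Z_\beta$ and $S(\rho,\rho_\beta)=\beta\,(F(\rho)-F(\rho_\beta))$; the random Hamiltonian, the temperature and all helper names are illustrative assumptions.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy with natural logarithm."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-np.sum(w * np.log(w)))

def log_psd(rho):
    """Logarithm of a strictly positive Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T

rng = np.random.default_rng(1)
dim, T = 4, 0.7
beta = 1.0 / T

a = rng.normal(size=(dim, dim))
H = (a + a.T) / 2                        # illustrative random Hamiltonian
e, v = np.linalg.eigh(H)
boltzmann = (v * np.exp(-beta * e)) @ v.T
Z = np.trace(boltzmann)
rho_beta = boltzmann / Z                 # Gibbs state

def free_energy(rho):
    """F(rho) = <H>_rho - T S(rho)."""
    return np.trace(rho @ H).real - T * entropy(rho)

g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho = g @ g.conj().T
rho /= np.trace(rho).real                # generic full-rank state

rel_ent = np.trace(rho @ (log_psd(rho) - log_psd(rho_beta))).real
free_gap = beta * (free_energy(rho) - free_energy(rho_beta))
print(rel_ent, free_gap)
```

With this convention the Gibbs state minimizes the free energy, so positivity of the relative entropy and the monotonic decrease of $F$ along the semigroup express the same fact.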


While the relative entropy behaves monotonically, this is not true of the von Neumann entropy (see Example 5.6.3). For instance, if the quantum dynamical semigroup mentioned before represents the reduced dynamics of a quantum open system S interacting with a reservoir, the free energy of an initial state may decrease in time, showing tendency to equilibrium, while its von Neumann entropy may in some cases decrease (for more details see [57]). The following examples provide a class of dynamical operations on the states of S which always increase the entropy of its states (or keep it constant).


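Examples 6.3.1

1. Bistochastic maps [355,356] Completely positive unital maps $\mathbb{E}:\mathbb{B}(\mathbb{H})\mapsto\mathbb{B}(\mathbb{H})$ are called bistochastic if their dual maps $\mathbb{F}:\mathcal{S}(S)\mapsto\mathcal{S}(S)$ preserve the tracial state: $\mathbb{F}\big[\frac{\mathbb{1}}{d}\big]=\frac{\mathbb{1}}{d}$.

The most natural bistochastic maps are those associated with projective POVMs $\mathcal{P}$, $\mathbb{F}_{\mathcal{P}}[\rho]=\sum_iP_i\,\rho\,P_i$, $P_iP_j=\delta_{ij}P_j$, $\sum_iP_i=\mathbb{1}$ (see Sect. 5.6.2). These maps increase the von Neumann entropy (or leave it unchanged); indeed, from (6.39)

$$S\Big(\mathbb{F}[\rho],\mathbb{F}\Big[\frac{\mathbb{1}}{d}\Big]\Big)=\log d-S(\mathbb{F}[\rho])\le S\Big(\rho,\frac{\mathbb{1}}{d}\Big)=\log d-S(\rho)\ .$$

2. Let $\mathcal{P}$ be a projective POVM as in the previous point. The linear span of the orthogonal projectors $P_i$, $i\in I$ (not necessarily one-dimensional, so that $\mathrm{card}(I)\le d$) is an Abelian subalgebra $A_{\mathcal{P}}\subset\mathbb{B}(\mathbb{H})$ with identity, whose typical elements have the form $a=\sum_{i\in I}a_iP_i$. The space of states $\mu$ over $A_{\mathcal{P}}$ consists of normalized, positive linear expectations

$$A_{\mathcal{P}}\ni a\mapsto\mu(a)=\sum_{i\in I}a_i\,\mu(P_i)\ .$$

They thus correspond to all possible discrete probability distributions with $\mathrm{card}(I)$ elements. Given a state $\rho\in\mathcal{S}(S)$ its restriction to $A_{\mathcal{P}}$, denoted by $\rho\upharpoonright A_{\mathcal{P}}$, corresponds to the discrete probability distribution $\mu_\rho:=\{\mathrm{Tr}(\rho\,P_i)\}_{i\in I}$. It thus follows that $\rho\upharpoonright A_{\mathcal{P}}=\mathbb{F}_{\mathcal{P}}[\rho]=\sum_{i\in I}P_i\,\rho\,P_i$, indeed

$$\mathrm{Tr}\big(\mathbb{F}_{\mathcal{P}}[\rho]\,a\big)=\sum_{i,j\in I}a_j\,\mathrm{Tr}(P_i\,\rho\,P_iP_j)=\sum_{i\in I}a_i\,\mathrm{Tr}(\rho\,P_i)\ .$$

From the previous point, it then follows that

$$S(\rho)=\min\big\{S(\rho\upharpoonright A)\ :\ A\subset\mathbb{B}(\mathbb{H})\ \hbox{Abelian with identity}\big\}\ ,$$

the minimum being achieved at any Abelian subalgebra $A$ generated by the eigenprojectors $P_i=|r_i\rangle\langle r_i|$ of $\rho$, for in this case $\mu_\rho(P_i)=\mathrm{Tr}(\rho\,|r_i\rangle\langle r_i|)=r_i$.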
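Both statements of Examples 6.3.1 can be illustrated numerically: a projective pinching $\rho\mapsto\sum_iP_i\rho P_i$ does not lower the von Neumann entropy, and pinching along the eigenprojectors of $\rho$ leaves it unchanged. The sketch assumes NumPy; the particular projectors and helper names are illustrative choices.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy with natural logarithm."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-np.sum(w * np.log(w)))

def pinch(rho, projectors):
    """Bistochastic map F_P[rho] = sum_i P_i rho P_i of a projective POVM."""
    return sum(P @ rho @ P for P in projectors)

rng = np.random.default_rng(3)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = g @ g.conj().T
rho /= np.trace(rho).real

# Illustrative projective POVM: one rank-2 and two rank-1 projectors.
projectors = [np.diag([1.0, 1.0, 0.0, 0.0]),
              np.diag([0.0, 0.0, 1.0, 0.0]),
              np.diag([0.0, 0.0, 0.0, 1.0])]
pinched = pinch(rho, projectors)

# Pinching along the eigenprojectors of rho reproduces rho itself.
_, v = np.linalg.eigh(rho)
eigenprojectors = [np.outer(v[:, k], v[:, k].conj()) for k in range(4)]
pinched_eig = pinch(rho, eigenprojectors)

print(entropy(rho), entropy(pinched), entropy(pinched_eig))
```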


The following result emphasizes the connections between von Neumann entropy and relative entropy [255]; the idea is to exploit the (infinitely many) convex decompositions of mixed states.

Proposition 6.3.2 Let $S$ be a quantum system described by a Hilbert space $\mathbb{H}$ and let $\rho=\sum_{i\in I}\lambda_i\rho_i$, $\lambda_i\ge0$, $\sum_{i\in I}\lambda_i=1$, be any convex decomposition of a mixed state $\rho\in\mathcal{S}(S)$ in terms of other density matrices $\rho_i\in\mathcal{S}(S)$. Then,

$$S(\rho)=\max\Big\{\sum_{i\in I}\lambda_i\,S(\rho_i,\rho)\ :\ \rho=\sum_{i\in I}\lambda_i\rho_i\Big\}\ .$$

Proof From (6.35), $\sum_{i\in I}\lambda_i\,S(\rho_i,\rho)=S(\rho)-\sum_{i\in I}\lambda_i\,S(\rho_i)\le S(\rho)$, while the decomposition of $\rho$ into its spectral eigenprojectors attains the upper bound.
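The identity used in the proof, $\sum_i\lambda_iS(\rho_i,\rho)=S(\rho)-\sum_i\lambda_iS(\rho_i)$, and the fact that the spectral decomposition saturates the bound can be checked as follows. This is a sketch assuming NumPy; the relative entropy is evaluated as $-S(\rho_i)-\mathrm{Tr}(\rho_i\log\rho)$ so that pure decomposers pose no problem, and the helper names are illustrative.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy with natural logarithm."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def log_psd(rho):
    """Logarithm of a strictly positive Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho_i, rho):
    """S(rho_i, rho) = -S(rho_i) - Tr(rho_i log rho), rho assumed full rank."""
    return -entropy(rho_i) - np.trace(rho_i @ log_psd(rho)).real

def random_state(dim, rng):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(4)
weights = rng.dirichlet(np.ones(3))
parts = [random_state(3, rng) for _ in weights]
rho = sum(l * r for l, r in zip(weights, parts))

# Generic decomposition: strictly below S(rho) because the parts are mixed.
generic = sum(l * relative_entropy(r, rho) for l, r in zip(weights, parts))
identity_rhs = entropy(rho) - sum(l * entropy(r) for l, r in zip(weights, parts))

# Spectral decomposition: pure decomposers, the bound S(rho) is attained.
w, v = np.linalg.eigh(rho)
spectral = sum(w[k] * relative_entropy(np.outer(v[:, k], v[:, k].conj()), rho)
               for k in range(3))
print(generic, spectral, entropy(rho))
```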


6.3.1 Holevo’s Bound and the Entropy of a Subalgebra 


As seen in Example 6.1.2, by encoding classical information into non-orthogonal 
quantum states one may always detect the presence of eavesdroppers during trans- 
mission. However, the non-orthogonality of the quantum code-words does not allow 
for perfect retrieval of the encoded classical information, for no measurement can 
perfectly distinguish between non-orthogonal states as seen in Example 5.6.5. 

More generally, the symbols $i\in I_A=\{1,2,\ldots,a\}$ of a classical alphabet, emitted with probabilities $p_1,p_2,\ldots,p_a$, might be encoded by means of mixed states $\rho_i\in\mathbb{B}_1^+(\mathbb{H})$. Or, from a more realistic viewpoint, given an encoding of the classical symbols into pure states $\psi_i\in\mathbb{H}$, a noisy transmission channel might transform them into mixed states $\rho_i:=\mathbb{F}[|\psi_i\rangle\langle\psi_i|]$, where $\mathbb{F}$ is the dual of a CPU map $\mathbb{E}:\mathbb{B}(\mathbb{H})\mapsto\mathbb{B}(\mathbb{H})$. The receiver must then reconstruct the encoded classical information with the least possible error; practically speaking, he must seek a POVM $\mathcal{B}=\{B_i\}_{i\in I_B}\subset\mathbb{B}(\mathbb{H})$, $I_B=\{1,2,\ldots,b\}$, such that, when measured on the statistical mixture $\rho=\sum_{a\in I_A}p_a\rho_a$, it maximizes the accessible information.

In such a context, three random variables appear: $A$, $B$, and $A\vee B$, with probability distributions $\pi_A$, $\pi_B$ and $\pi_{A\vee B}$:

1. the outcomes of $A$ correspond to the indices $a\in I_A$ of the incoming states and $\pi_A=\{p_a\}_{a\in I_A}$;

2. the outcomes of $B$ correspond to the indices $i\in I_B$ of the POVM and $\pi_B=\{\mathrm{Tr}(\rho\,B_i)\}_{i\in I_B}$;

3. the outcomes of $A\vee B$ correspond to the joint events consisting of an incoming state $\rho_a$ and a measured index $i$: $\pi_{A\vee B}=\{p_a\,\mathrm{Tr}(\rho_aB_i)\}_{a\in I_A,\,i\in I_B}$.

According to Sect. 2.4.5, the mutual information $I(A,B)$ measures how much knowledge one gains about $A$, that is about which state $\rho_a$ has reached Bob, from measuring on $B$ the POVM $\mathcal{B}=\{B_i\}_{i\in I_B}$:

$$I(A,B)=H(A)+H(B)-H(A\vee B)$$

$$=-\sum_{a\in I_A}p_a\log p_a-\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\log\mathrm{Tr}(\rho\,B_i)+\sum_{a\in I_A}\sum_{i\in I_B}p_a\,\mathrm{Tr}(\rho_aB_i)\log\big(p_a\,\mathrm{Tr}(\rho_aB_i)\big)$$

$$=-\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\log\mathrm{Tr}(\rho\,B_i)+\sum_{a\in I_A}p_a\sum_{i\in I_B}\mathrm{Tr}(\rho_aB_i)\log\mathrm{Tr}(\rho_aB_i)\ .\qquad (6.44)$$


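In the classical case, perfect knowledge of $A$ from knowing $B$ can be achieved by choosing $B$ such that $H(A|B)=0$; in the quantum case, there is a more stringent upper bound on $I(A,B)$ that depends on the given decomposition $\rho=\sum_{a\in I_A}p_a\rho_a$ and is denoted by $\chi(\rho,\{p_a\rho_a\}_{a\in I_A})$:

Proposition 6.3.3 (Holevo’s Bound) Given $\rho=\sum_{a\in I_A}p_a\rho_a\in\mathbb{B}_1^+(\mathbb{H})$ and the POVM $\mathcal{B}=\{B_i\}_{i\in I_B}\subset\mathbb{B}(\mathbb{H})$,

$$I(A,B)\le\chi(\rho,\{p_a\rho_a\}_{a\in I_A}):=S(\rho)-\sum_{a\in I_A}p_a\,S(\rho_a)\ .\qquad (6.45)$$

Proof ([25]) Given the POVM $\mathcal{B}=\{B_i\}_{i\in I_B}\subseteq\mathbb{B}(\mathbb{H})$, let $\mathbf{B}=\{b_j\}_{j\in I_B}$ be an Abelian algebra with minimal projections $b_j$, $b_ib_j=\delta_{ij}b_i$, $\sum_{i\in I_B}b_i=\mathbb{1}_{\mathbf{B}}$. It can be embedded into $\mathbb{B}(\mathbb{H})$ (as a linear space) by means of the linear map $\gamma_{\mathcal{B}}:\mathbf{B}\mapsto\mathbb{B}(\mathbb{H})$ such that $\gamma_{\mathcal{B}}[b_i]=B_i$. Positive operators in $\mathbf{B}$ are of the form $b=\sum_{i\in I_B}\beta_ib_i$, $\beta_i\ge0$, therefore $\gamma_{\mathcal{B}}[b]=\sum_{i\in I_B}\beta_iB_i\ge0$ so that $\gamma_{\mathcal{B}}$ is a positive map and, because of Example 5.2.6.7, completely positive. Also, $\gamma_{\mathcal{B}}[\mathbb{1}_{\mathbf{B}}]=\sum_{i\in I_B}\gamma_{\mathcal{B}}[b_i]=\sum_{i\in I_B}B_i=\mathbb{1}$, whence $\gamma_{\mathcal{B}}$ is a CPU map. Furthermore, the states $\rho\circ\gamma_{\mathcal{B}}$ and $\rho_a\circ\gamma_{\mathcal{B}}$ on $\mathbf{B}$ are diagonal density matrices with eigenvalues $\{\mathrm{Tr}(\rho\,B_i)\}_{i\in I_B}$, respectively $\{\mathrm{Tr}(\rho_aB_i)\}_{i\in I_B}$. Thus, (6.44), the monotonicity of the relative entropy (6.39) and Proposition 6.3.2 yield

$$I(A,B)=\sum_{a\in I_A}p_a\,S(\rho_a\circ\gamma_{\mathcal{B}},\rho\circ\gamma_{\mathcal{B}})\le\sum_{a\in I_A}p_a\,S(\rho_a,\rho)=S(\rho)-\sum_{a\in I_A}p_a\,S(\rho_a)\ .$$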
Remarks 6.3.1

1. Using (5.166) one derives that

$$\chi(\rho,\{p_a\rho_a\}_{a\in I_A})=S(\rho)-\sum_{a\in I_A}p_a\,S(\rho_a)\le-\sum_{a\in I_A}p_a\log p_a=H(A)\ .$$

Thus, if (6.45) is a strict inequality, perfect reconstruction of $A$ upon knowledge of $B$ is not possible; on the other hand, the inequality is strict unless the states $\rho_a$ are orthogonal to each other and thus perfectly distinguishable.

2. A consequence of Holevo’s bound is that any quantum encoding of $n$ bits into $n$ non-orthogonal qubit states $|\psi_i\rangle\in\mathbb{C}^2$ achieves secure transmission, but cannot transfer more than $H(A)\le n$ bits of information (when the entropy is expressed in base 2).

3. Whether the upper bound is achieved or not depends on the ability on the part of the receiver to find one or more optimal detection strategies, namely those POVMs $\mathcal{B}$ that maximize $I(A,B)$. As we shall see this is a remarkably difficult analytical problem even in low dimension.

4. By taking the supremum over all possible POVMs $\mathcal{B}$ that the receiver may devise as detection strategies, one defines the maximal accessible information as

$$I(A):=\sup_{\mathcal{B}}I(A,B)\le\chi(\rho,\{p_a\rho_a\}_{a\in I_A})\ .\qquad (6.46)$$


Example 6.3.2 Suppose A transmits the bits 0 and 1 to B by encoding them into the non-orthogonal states

$$|\pm\rangle=\sqrt{p}\,|\psi_1\rangle\pm\sqrt{1-p}\,|\psi_2\rangle\in\mathbb{C}^2\ ,\qquad 0<p<1\ ,\quad\langle\psi_1|\psi_2\rangle=0\ ,$$

chosen with equal probability. The statistics of the encoded quantum signals is thus described by

$$M_2(\mathbb{C})\ni\rho=\frac12|+\rangle\langle+|+\frac12|-\rangle\langle-|=p\,|\psi_1\rangle\langle\psi_1|+(1-p)\,|\psi_2\rangle\langle\psi_2|\ ,$$

and $\chi\big(\rho,\{\frac12|\pm\rangle\langle\pm|\}\big)=H_2(p)=-p\log p-(1-p)\log(1-p)$ reaches its maximum of 1 bit of transmissible information only when $p=1/2$ so that $\langle+|-\rangle=2p-1=0$.
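For this two-state encoding the Holevo quantity and the mutual information (6.44) of a specific measurement can be compared directly. The following sketch assumes NumPy, base-2 logarithms and an illustrative value $p=0.8$; the measurement in the basis $(|\psi_1\rangle\pm|\psi_2\rangle)/\sqrt2$ is one possible detection strategy chosen for the illustration, not necessarily the one the text has in mind.

```python
import numpy as np

def h2(p):
    """Binary Shannon entropy in bits."""
    return float(-sum(x * np.log2(x) for x in (p, 1 - p) if x > 0))

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-np.sum(w * np.log2(w)))

def xlog2x(x):
    return x * np.log2(x) if x > 1e-15 else 0.0

def mutual_information(priors, states, povm):
    """I(A,B) in bits, following the last expression in (6.44)."""
    rho = sum(q * s for q, s in zip(priors, states))
    info = -sum(xlog2x(np.trace(rho @ b).real) for b in povm)
    info += sum(q * sum(xlog2x(np.trace(s @ b).real) for b in povm)
                for q, s in zip(priors, states))
    return info

p = 0.8                                   # illustrative value
plus = np.array([np.sqrt(p), np.sqrt(1 - p)])
minus = np.array([np.sqrt(p), -np.sqrt(1 - p)])
states = [np.outer(plus, plus), np.outer(minus, minus)]
priors = [0.5, 0.5]

rho = sum(q * s for q, s in zip(priors, states))
chi = entropy_bits(rho) - sum(q * entropy_bits(s) for q, s in zip(priors, states))

# Projective measurement in the basis (|psi_1> +/- |psi_2>)/sqrt(2).
x_plus = np.array([1.0, 1.0]) / np.sqrt(2)
x_minus = np.array([1.0, -1.0]) / np.sqrt(2)
povm = [np.outer(x_plus, x_plus), np.outer(x_minus, x_minus)]
info = mutual_information(priors, states, povm)
print(info, "<=", chi, "=", h2(p))
```

Measuring instead in the basis $\{|\psi_1\rangle,|\psi_2\rangle\}$ would give outcome probabilities $p$ and $1-p$ for both signals and therefore no information at all, which shows how strongly $I(A,B)$ depends on the chosen POVM.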


The problem of achieving the maximal accessible information Z (A) appeared 
earlier than in quantum information, in relation to finding optimal decompositions 
achieving the so-called entropy of a subalgebra [108,255], the building block for 
constructing a particular quantum extension of the KS entropy to be discussed later 
in Chap. 8. 

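Let $M\subset\mathbb{B}(\mathbb{H})$ be a finite-dimensional subalgebra with identity, $\rho\in\mathbb{B}_1^+(\mathbb{H})$ a state on $\mathbb{B}(\mathbb{H})$ and $\rho\upharpoonright M$ the state on $M$ which results from restricting $\rho$ to act (as an expectation) on the observables in $M$, only.

Example 6.3.3 Let $A\subset\mathbb{B}(\mathbb{H})$ be an Abelian subalgebra with $k\le\dim(\mathbb{H})$ minimal projectors $a_i$ and $\rho\in\mathbb{B}_1^+(\mathbb{H})$ a density matrix; then, $\rho\upharpoonright A$ amounts to the classical probability distribution $\pi_A=\{\mathrm{Tr}(\rho\,a_i)\}_{i=1}^k$.

Definition 6.3.2 (Entropy of a Subalgebra) Let $M\subset\mathbb{B}(\mathbb{H})$ be a subalgebra and $\rho\in\mathbb{B}_1^+(\mathbb{H})$ a state; the entropy of $M$ relative to $\rho$ is

$$H_\rho(M):=\sup_{\rho=\sum_{i\in I}\lambda_i\rho_i}\ \sum_{i\in I}\lambda_i\,S(\rho_i\upharpoonright M,\rho\upharpoonright M)\qquad (6.47)$$

$$=S(\rho\upharpoonright M)-\inf_{\rho=\sum_{i\in I}\lambda_i\rho_i}\ \sum_{i\in I}\lambda_i\,S(\rho_i\upharpoonright M)\ ,\qquad (6.48)$$

where the sup in (6.47) and the inf in (6.48) are taken with respect to all possible linear convex decompositions $\rho=\sum_{i\in I}\lambda_i\rho_i$.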


It must be stressed that, unlike in Proposition 6.3.2, the decomposition comes first 
and the restriction to the subalgebra only afterwards; this is what makes the explicit 
computation of the entropy of a subalgebra a complicated variational problem, in 
general. Luckily, there are particular instances of states and subalgebras where things 
are easier. 


Example 6.3.4 Consider the Abelian subalgebra $A$ of Example 6.3.3 and let $\rho\in\mathbb{B}_1^+(\mathbb{H})$ be a state which commutes with all elements of $A$. The following decomposition of $\rho$ is optimal,

$$\rho=\sum_{i=1}^k\mathrm{Tr}(\rho\,a_i)\,\rho_i\ ,\qquad\rho_i:=\frac{\sqrt{\rho}\,a_i\sqrt{\rho}}{\mathrm{Tr}(\rho\,a_i)}=\frac{\rho\,a_i}{\mathrm{Tr}(\rho\,a_i)}\ .$$

Indeed, the restrictions $\rho\upharpoonright A$ and $\rho_i\upharpoonright A$ are probability distributions

$$\pi_A=\big\{\mathrm{Tr}(\rho\,a_j)\big\}_{j=1}^k\ ,\qquad\pi^i_A=\Big\{\frac{\mathrm{Tr}(\rho\,a_i\,a_j)}{\mathrm{Tr}(\rho\,a_i)}\Big\}_{j=1}^k=\{\delta_{ij}\}_{j=1}^k\ ,$$

such that $S(\rho_i\upharpoonright A)=0$ and

$$H_\rho(A)=S(\rho\upharpoonright A)=-\sum_{i=1}^k\mathrm{Tr}(\rho\,a_i)\log\mathrm{Tr}(\rho\,a_i)\ .$$

The simplest context in which the above argument applies is when, instead of $\mathbb{B}(\mathbb{H})$, one deals with an Abelian von Neumann algebra. Then, as seen in Sect. 5.3.2, via the Gelfand transform, any finite subalgebra $A$ of it has minimal projections which correspond to the characteristic functions of suitable measurable subsets of a measure space and thus identify a finite partition of the latter or, equivalently, a random variable $A$. Moreover, the state $\omega$ becomes a probability measure $\mu$ and gives a probability distribution over $A$ such that the entropy of $A$ yields the Shannon entropy of $A$: $H_\mu(A)=H(A)$.

Instead, the simplest non-commutative application of the previous argument is when $\mathbb{B}(\mathbb{H})=M_d(\mathbb{C})$ and $\rho=\mathbb{1}_d/d$ is the tracial state; then,

$$H_\rho(A)=S(\rho\upharpoonright A)=H(A)=-\sum_{i=1}^k\frac{\mathrm{Tr}(a_i)}{d}\log\frac{\mathrm{Tr}(a_i)}{d}\ .$$


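We now relate the variational problem in (6.47) to the one in (6.46). The clue is that POVMs such as $\mathcal{B}=\{B_i\}_{i\in I_B}$ give rise to decompositions and vice versa, while the main technical tools are provided by the GNS construction $\pi_\rho(\mathbb{B}(\mathbb{H}))$ based on the state $\rho$. Of particular importance is the possibility of dealing with decompositions by means of positive elements in the commutant $\pi_\rho(\mathbb{B}(\mathbb{H}))'$ or even in $\pi_\rho(\mathbb{B}(\mathbb{H}))$ itself (see Remark 5.3.2.3 and the relation in (5.156)), an instance of which already appears in the previous example. It follows that the probabilities in (6.44) can be recast as

$$\mathrm{Tr}(\rho_aB_i)=\frac{\langle\Omega_\rho|\pi_\rho(B_i)\,X'_a|\Omega_\rho\rangle}{\langle\Omega_\rho|X'_a|\Omega_\rho\rangle}=\frac{\mathrm{Tr}(\rho\,B_i)}{p_a}\,\rho'_i(X'_a)\ ,\qquad (6.49)$$

$$\rho'_i(X'_a):=\frac{\langle\Omega_\rho|\pi_\rho(B_i)\,X'_a|\Omega_\rho\rangle}{\mathrm{Tr}(\rho\,B_i)}\ ,\qquad p_a=\langle\Omega_\rho|X'_a|\Omega_\rho\rangle\ ,\qquad (6.50)$$

where $|\Omega_\rho\rangle$ is the GNS cyclic vector and the $X'_a\in\pi_\rho(\mathbb{B}(\mathbb{H}))'$, $a\in I_A$, are positive operators in the commutant such that $\sum_{a\in I_A}X'_a=\mathbb{1}$.

It turns out that the linear functionals $\pi_\rho(\mathbb{B}(\mathbb{H}))'\ni X'\mapsto\rho'_i(X')$ are positive and normalized, hence states on the commutant, as well as

$$\rho'(X')=\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\,\rho'_i(X')=\langle\Omega_\rho|X'|\Omega_\rho\rangle\ .\qquad (6.51)$$

In analogy with the proof of Holevo’s bound, let $\mathbf{A}=\{\tilde a_j\}_{j\in I_A}$ be an Abelian algebra (with identity) generated by minimal projections $\tilde a_j$ and introduce the CPU map $\gamma'_A:\mathbf{A}\mapsto\pi_\rho(\mathbb{B}(\mathbb{H}))'$, $\gamma'_A[\tilde a_j]=X'_j$, that sends $\mathbf{A}$ into the commutant $\pi_\rho(\mathbb{B}(\mathbb{H}))'$. Then, using (6.49) and (6.50), one sees that the states $\rho'\circ\gamma'_A$ and $\rho'_i\circ\gamma'_A$, $i\in I_B$, on $\mathbf{A}$ correspond to the probability distributions $\pi'_A=\{p_a\}_{a\in I_A}$ and $\pi'^{\,i}_A=\{\rho'_i(X'_a)\}_{a\in I_A}$; whence (6.44) can be rewritten as

$$I(A,B)=-\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\log\mathrm{Tr}(\rho\,B_i)+\sum_{a\in I_A}\sum_{i\in I_B}p_a\,\mathrm{Tr}(\rho_aB_i)\log\mathrm{Tr}(\rho_aB_i)$$

$$=-\sum_{i\in I_B}\sum_{a\in I_A}\mathrm{Tr}(\rho\,B_i)\,\rho'_i(X'_a)\log p_a+\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\sum_{a\in I_A}\rho'_i(X'_a)\log\rho'_i(X'_a)$$

$$=-\sum_{a\in I_A}p_a\log p_a+\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\sum_{a\in I_A}\rho'_i(X'_a)\log\rho'_i(X'_a)$$

$$=S(\rho'\circ\gamma'_A)-\sum_{i\in I_B}\mathrm{Tr}(\rho\,B_i)\,S(\rho'_i\circ\gamma'_A)\ .\qquad (6.52)$$

Therefore, the maximal accessible information relative to the encoding $\{\rho,p_a\rho_a\}$ equals the entropy of the CPU map $\gamma'_A:\mathbf{A}\mapsto\pi_\rho(\mathbb{B}(\mathbb{H}))'$ relative to the state $\rho'$ on the commutant: $I(A)=H_{\rho'}(\gamma'_A)$. If $\rho$ is a faithful state, one can use (5.156) to substitute the $X'_a$ with elements of a POVM in $\mathbb{B}(\mathbb{H})$ and $\gamma'_A$ with a CPU map $\gamma_A:\mathbf{A}\mapsto\mathbb{B}(\mathbb{H})$, so that $I(A)=H_\rho(\gamma_A)$.

The natural embedding $\iota_M$ of a subalgebra $M\subset\mathbb{B}(\mathbb{H})$ into $\mathbb{B}(\mathbb{H})$ is a CPU map (see Examples 5.2.3.7 and 8) such that $\rho\upharpoonright M=\rho\circ\iota_M$. This observation suggests the following extension of Definition 6.3.2.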


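Definition 6.3.3 (Entropy of CPU maps) Given a completely positive unital map $\gamma:M\mapsto\mathbb{B}(\mathbb{H})$, where $M$ is a finite-dimensional algebra, its entropy relative to a state $\rho\in\mathbb{B}_1^+(\mathbb{H})$ is

$$H_\rho(\gamma):=\sup_{\rho=\sum_{i\in I}\lambda_i\rho_i}\ \sum_{i\in I}\lambda_i\,S(\rho_i\circ\gamma,\rho\circ\gamma)\qquad (6.53)$$

$$=S(\rho\circ\gamma)-\inf_{\rho=\sum_{i\in I}\lambda_i\rho_i}\ \sum_{i\in I}\lambda_i\,S(\rho_i\circ\gamma)\ .\qquad (6.54)$$

Lemma 6.3.1 1. Given a CPU map $\gamma:M\mapsto\mathbb{B}(\mathbb{H})$ from a finite dimensional algebra $M$ into $\mathbb{B}(\mathbb{H})$, one has

$$0\le H_\rho(\gamma)\le S(\rho\circ\gamma)\le\log\dim(M)\ ,\qquad (6.55)$$

where $\dim(M)$ is the dimension of any maximally Abelian subalgebra contained in $M$.

2. If $\rho$ is a faithful state, then $H_\rho(M)>0$ unless $M$ is the trivial algebra, consisting only of multiples of the identity.

3. Consider two finite dimensional algebras $M_{1,2}$ and two CPU maps $\gamma_1:M_1\mapsto M_2$, $\gamma_2:M_2\mapsto\mathbb{B}(\mathbb{H})$,

$$H_\rho(\gamma_2\circ\gamma_1)\le H_\rho(\gamma_2)\ .\qquad (6.56)$$

In particular, if $N\subset M\subset\mathbb{B}(\mathbb{H})$ are two finite dimensional subalgebras,

$$H_\rho(N)\le H_\rho(M)\ .\qquad (6.57)$$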


Proof Positivity and boundedness are evident, monotonicity under CPU maps follows from (6.40) applied to Definition 6.3.3, while monotonicity under algebraic embeddings follows from considering the CPU maps consisting of the natural inclusions $\iota_M$ of $M$ into $\mathbb{B}(\mathbb{H})$ and $\iota_{NM}$ of $N$ into $M$:

$$H_\rho(N)=H_\rho(\iota_M\circ\iota_{NM})\le H_\rho(\iota_M)=H_\rho(M)\ .$$

As regards the second property, suppose that $H_\rho(M)=0$, then, the first property of the relative entropy in Proposition 6.3.1 yields $\rho\upharpoonright M=\rho_i\upharpoonright M$ for all decompositions $\rho=\sum_{i\in I}\lambda_i\rho_i$. Then, consider the GNS representation of $\mathbb{B}(\mathbb{H})$ based on $\rho$ and set $\rho(M):=\mathrm{Tr}(\rho\,M)=\langle\Omega_\rho|\pi_\rho(M)|\Omega_\rho\rangle$, for all $M\in M$. It follows that

$$\langle\Omega_\rho|X'\,\pi_\rho(M)|\Omega_\rho\rangle=\rho(M)\,\langle\Omega_\rho|X'|\Omega_\rho\rangle\ ,\quad\hbox{equivalently}\quad\langle\Omega_\rho|X'\big(\pi_\rho(M)-\rho(M)\,\mathbb{1}\big)|\Omega_\rho\rangle=0\ ,$$

for all $0\le X'\le\mathbb{1}$ in the commutant $\pi_\rho(\mathbb{B}(\mathbb{H}))'$. Notice that any such $X'$ can be written as a sum of positive $\mathbb{1}\ge X'_{1,2}\in\pi_\rho(\mathbb{B}(\mathbb{H}))'$; then, since $\rho$ faithful on $\mathbb{B}(\mathbb{H})$ implies $|\Omega_\rho\rangle$ separating for $\pi_\rho(\mathbb{B}(\mathbb{H}))$ and thus cyclic for $\pi_\rho(\mathbb{B}(\mathbb{H}))'$ (see Lemma 5.3.1), it follows that $\big(\pi_\rho(M)-\rho(M)\big)|\Omega_\rho\rangle$ is orthogonal to a dense subset of the GNS Hilbert space $\mathbb{H}_\rho$, whence, again from the faithfulness of $\rho$, $M=\rho(M)\,\mathbb{1}$ for all $M\in M$.


Example 6.3.5 The last result in Example 6.3.4 extends to subalgebras $M\subset\mathbb{B}(\mathbb{H})$ which are not Abelian but commute with the state $\rho$ [7]; then

$$H_\rho(M)=S(\rho\upharpoonright A)\ ,\qquad (6.58)$$

where $A$ is any maximally Abelian subalgebra contained in $M$. Indeed, from Example 6.3.1.2 and the first case discussed in Example 6.3.4 it follows that $S(\rho\upharpoonright M)=S(\rho\upharpoonright A)=H_\rho(A)$, where $A\subset M$ is maximally Abelian; on the other hand, from Lemma 6.3.1, one deduces that

$$S(\rho\upharpoonright M)=S(\rho\upharpoonright A)=H_\rho(A)\le H_\rho(M)\le S(\rho\upharpoonright M)\ .$$

Apart from the simple cases discussed in Examples 6.3.4 and 6.3.5, the minimization of the linear convex combination of von Neumann entropies in (6.54) is in general an extremely difficult task. At first sight, one might even suspect that more than discrete convex decompositions of the state $\rho$ have to be considered; luckily, the following result ensures that $H_\rho(M)$ can be reached within $\varepsilon>0$ by means of discrete decompositions [108,264]. We shall denote by $H^\gamma_\rho(\{\lambda_i,\rho_i\}_{i\in I})$ the argument of the supremum in (6.53) evaluated at a given decomposition $\rho=\sum_{i\in I}\lambda_i\rho_i$, namely

$$H^\gamma_\rho(\{\lambda_i,\rho_i\}_{i\in I}):=\sum_{i\in I}\lambda_i\,S(\rho_i\circ\gamma,\rho\circ\gamma)\ .\qquad (6.59)$$

[7] In such a case, one says that such $M$ are contained in the centralizer of $\rho$.
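Proposition 6.3.4 Let $\gamma:M\mapsto\mathbb{B}(\mathbb{H})$ be a CPU map from a finite dimensional algebra $M$ into $\mathbb{B}(\mathbb{H})$ and $\rho\in\mathbb{B}_1^+(\mathbb{H})$ a density matrix. Given a decomposition $\rho=\sum_{i\in I}\lambda_i\rho_i$ and $\varepsilon>0$, there exists a decomposition $\rho=\sum_{j\in J}\lambda'_j\rho'_j$ where $\mathrm{card}(J)$ depends on $\dim(M)$ and $\varepsilon$, such that

$$\Big|H^\gamma_\rho(\{\lambda_i,\rho_i\}_{i\in I})-H^\gamma_\rho(\{\lambda'_j,\rho'_j\}_{j\in J})\Big|\le\varepsilon\ .\qquad (6.60)$$

Proof Consider a finite partition $Z=\{Z_j\}_{j\in J}$ of the state-space $\mathcal{S}(M)$ of $M$ into subsets $Z_j$ such that

$$\sigma_{1,2}\in Z_j\Longrightarrow\|\sigma_1-\sigma_2\|\le\delta\qquad\forall\,Z_j\in Z\ .$$

For instance, $\mathrm{card}(J)$ can be chosen not larger than the least number of balls of radius $\delta$ that are necessary to cover $\mathcal{S}(M)$. Define

$$\rho'_j:=\frac{1}{\lambda'_j}\sum_{i\in I\,:\,\rho_i\circ\gamma\in Z_j}\lambda_i\,\rho_i\ ,\qquad\lambda'_j:=\sum_{i\in I\,:\,\rho_i\circ\gamma\in Z_j}\lambda_i\ .$$

By construction, $\rho=\sum_{j\in J}\lambda'_j\rho'_j$ and

$$\Big|H^\gamma_\rho(\{\lambda_i,\rho_i\}_{i\in I})-H^\gamma_\rho(\{\lambda'_j,\rho'_j\}_{j\in J})\Big|=\Big|\sum_{i\in I}\lambda_i\,S(\rho_i\circ\gamma)-\sum_{j\in J}\lambda'_j\,S(\rho'_j\circ\gamma)\Big|$$

$$\le\sum_{j\in J}\ \sum_{i\in I\,:\,\rho_i\circ\gamma\in Z_j}\lambda_i\,\Big|S(\rho_i\circ\gamma)-S(\rho'_j\circ\gamma)\Big|\ .$$

By choosing $\delta$ appropriately, the result follows from the Fannes inequality (see (5.167)).

From this result, it follows that, for any $\varepsilon>0$, there exists a decomposition $\rho=\sum_{i\in I}\lambda_i\rho_i$ with $\mathrm{card}(I)$ depending on $\dim(M)$ and $\varepsilon$, such that

$$H^\gamma_\rho(\{\lambda_i,\rho_i\}_{i\in I})\ge H_\rho(\gamma)-\varepsilon\ .\qquad (6.61)$$

We shall call $\varepsilon$-optimal for $\gamma$ the decompositions which achieve $H_\rho(\gamma)$ within $\varepsilon>0$ and optimal for $\gamma$ those decompositions $\rho=\sum_i\lambda_i\rho_i$ such that

$$H_\rho(\gamma)=S(\rho\circ\gamma)-\sum_i\lambda_i\,S(\rho_i\circ\gamma)\ .$$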
6.3.2 Entropy of a Subalgebra and Entanglement of Formation


In this section, we shall consider some techniques developed in [61–63] that are of help in calculating the entropy of a subalgebra $H_\rho(A)$ where $A$ is a maximally Abelian ($n$-dimensional) subalgebra of a full matrix algebra $M_n(\mathbb{C})$. The first step is to extract from (6.48) the expression

$$E_\rho[\mathbb{M},M]:=\inf_{\rho=\sum_{i\in I}\lambda_i\rho_i}\ \sum_{i\in I}\lambda_i\,S(\rho_i\upharpoonright M)\ ,\qquad (6.62)$$

where we have specified the state of the system, the total algebra of its observables $\mathbb{M}$ and the selected subalgebra $M\subset\mathbb{M}$. Then, one notices that the variational problem can be solved by restricting to decompositions of $\rho$ in terms of pure states; this is so because the von Neumann entropy is concave (see (5.166)). In fact, assume $\rho=\sum_j\lambda_j\rho_j$ optimal for $M$ (so that $E_\rho[\mathbb{M},M]=\sum_j\lambda_j\,S(\rho_j\upharpoonright M)$), with non-pure decomposers $\rho_j$. Then, by further decomposing $\rho_j=\sum_k\lambda_{jk}\rho_{jk}$, one gets another decomposition $\rho=\sum_{j,k}\lambda_j\lambda_{jk}\rho_{jk}$; thence, $S(\rho_j\upharpoonright M)\ge\sum_k\lambda_{jk}\,S(\rho_{jk}\upharpoonright M)$ yields

$$E_\rho[\mathbb{M},M]\le\sum_{j,k}\lambda_j\lambda_{jk}\,S(\rho_{jk}\upharpoonright M)\le\sum_j\lambda_j\,S(\rho_j\upharpoonright M)=E_\rho[\mathbb{M},M]\ .$$

Notice that, since pure states $P_j$ cannot be decomposed, for them it holds that

$$E_{P_j}[\mathbb{M},M]=S(P_j\upharpoonright M)\ .\qquad (6.63)$$

In a similar way, one shows that the functional $E_\rho[M_n(\mathbb{C}),M]$ is convex over the state space $\mathbb{B}_1^+(\mathbb{C}^n)$: given a convex combination $\rho=\sum_j\nu_j\rho_j$, the optimal decompositions $\rho_j=\sum_k\lambda_{jk}\rho_{jk}$ that achieve

$$E_{\rho_j}[M_n(\mathbb{C}),M]=\sum_k\lambda_{jk}\,S(\rho_{jk}\upharpoonright M)$$

for each $j$, provide a decomposition $\rho=\sum_{j,k}\nu_j\lambda_{jk}\rho_{jk}$ which need not be optimal, whence

$$E_{\sum_j\nu_j\rho_j}[M_n(\mathbb{C}),M]\le\sum_{j,k}\nu_j\lambda_{jk}\,S(\rho_{jk}\upharpoonright M)=\sum_j\nu_j\,E_{\rho_j}[M_n(\mathbb{C}),M]\le\sum_j\nu_j\,S(\rho_j\upharpoonright M)\ .\qquad (6.64)$$

We shall fix $\mathbb{M}=M_n(\mathbb{C})$ for some $n$; the following results turn out to be useful [63].
Proposition 6.3.5 For a fixed density matrix $\rho\in M_n(\mathbb{C})$ and $M\subset M_n(\mathbb{C})$,

1. there is an optimal decomposition consisting of no more than $n^2$ decomposers;

2. the functional $E_\rho[M_n(\mathbb{C}),M]$ is linear on the convex hull of the optimal decomposers of $\rho$; namely, if $\rho=\sum_i\lambda_iP_i$ is an optimal decomposition for $E_\rho[M_n(\mathbb{C}),M]$, where the $P_i$ are projections, then any other convex combination $\tilde\rho=\sum_j\nu_jP_j$, with weights $\nu_j\ge0$, $\sum_j\nu_j=1$, is also optimal in the sense that

$$E_{\tilde\rho}[M_n(\mathbb{C}),M]=\sum_j\nu_j\,S(P_j\upharpoonright M)\ .$$

Proof The first statement results from a theorem of Caratheodory [21] since $M_n(\mathbb{C})$ is $n^2$ dimensional as a linear space and the set of pure states is compact [357] (see Remark 5.3.2.5).

The second statement is a consequence of (6.63) and (6.64); indeed, as a convex functional, $E_\rho[M_n(\mathbb{C}),M]$ can be expressed as

$$E_\rho[M_n(\mathbb{C}),M]=\sup\big\{A[\rho]\ :\ A\ \hbox{affine functional on}\ \mathbb{B}_1^+(\mathbb{C}^n)\big\}\ .$$

Let $E_\rho[M_n(\mathbb{C}),M]=A[\rho]$, then $A[\sigma]\le E_\sigma[M_n(\mathbb{C}),M]$ for a different state $\sigma$; thus, given an optimal decomposition $\rho=\sum_i\lambda_iP_i$, where $\lambda_i>0$ and the $P_i$ are projections,

$$E_\rho[M_n(\mathbb{C}),M]=\sum_i\lambda_i\,S(P_i\upharpoonright M)=\sum_i\lambda_i\,E_{P_i}[M_n(\mathbb{C}),M]\ge\sum_i\lambda_i\,A[P_i]=A[\rho]=E_\rho[M_n(\mathbb{C}),M]\ .$$

Therefore, $A[P_i]=E_{P_i}[M_n(\mathbb{C}),M]$ for all $i$; consequently, if $\tilde\rho=\sum_j\nu_jP_j$ is any convex combination of these optimal projections, then

$$E_{\tilde\rho}[M_n(\mathbb{C}),M]\le\sum_j\nu_j\,S(P_j\upharpoonright M)=\sum_j\nu_j\,E_{P_j}[M_n(\mathbb{C}),M]=\sum_j\nu_j\,A[P_j]=A[\tilde\rho]\le E_{\tilde\rho}[M_n(\mathbb{C}),M]\ .$$


Calculating $E_\rho[M_n(\mathbb{C}),M]$ can be simplified if the state $\rho$ enjoys symmetries that leave the subalgebra $M$ invariant as a set; namely, suppose there exists a unitary matrix $U:\mathbb{C}^n\mapsto\mathbb{C}^n$ such that $T_u[\rho]=U\rho\,U^\dagger=\rho$ and $T_u^T[M]=M$, where $T_u^T:M_n(\mathbb{C})\mapsto M_n(\mathbb{C})$ is the dual map of $T_u$. Then,

Proposition 6.3.6 Let $E_\rho[M_n(\mathbb{C}),M]$ be achieved at the optimal decomposition $\rho=\sum_i\lambda_iP_i$; then, the symmetry map $T_u$ gives other optimal decompositions.

Proof From $\rho=T_u[\rho]=\sum_i\lambda_i\,T_u[P_i]$ and $T_u[P_i]\upharpoonright M=P_i\upharpoonright T_u^T[M]=P_i\upharpoonright M$ it follows that

$$E_\rho[M_n(\mathbb{C}),M]\le\sum_i\lambda_i\,S(T_u[P_i]\upharpoonright M)=\sum_i\lambda_i\,S(P_i\upharpoonright M)=E_\rho[M_n(\mathbb{C}),M]\ .$$

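Particularly suggestive instances of states $\rho\in\mathbb{B}_1^+(\mathbb{C}^d)$ with symmetries are those that are permutation invariant with respect to a given ONB $\{|i\rangle\}_{i=1}^d$; they are of the form

$$\rho^{(d)}_x=\frac{1-x}{d}\,\mathbb{1}_d+\frac{x}{d}\sum_{i,j=1}^d|i\rangle\langle j|=\frac{1-x}{d}\,\mathbb{1}_d+x\,|\psi_+\rangle\langle\psi_+|\ ,\qquad (6.65)$$

where $|\psi_+\rangle:=\frac{1}{\sqrt d}\sum_{i=1}^d|i\rangle$, so that $-\frac{1}{d-1}\le x\le1$.

By defining $F:=\langle\psi_+|\rho^{(d)}_x|\psi_+\rangle$, $0\le F\le1$, one can rewrite $\rho^{(d)}_x$ in a way which is directly comparable with the isotropic states (6.3):

$$\frac{1-F}{d-1}\,\mathbb{1}_d+\frac{dF-1}{d-1}\,|\psi_+\rangle\langle\psi_+|\ ,\qquad (6.66)$$

which we shall denote in the following by $\rho^{(d)}_F$ as they are obtained from the isotropic states $\rho^{(d^2)}_F$ in (6.3) by changing $d^2$ into $d$; notice that, with $|\hat\Psi_+\rangle$ the symmetric vector entering (6.3),

$$\langle\hat\Psi_+|\rho^{(d^2)}_F|\hat\Psi_+\rangle=\langle\psi_+|\rho^{(d)}_F|\psi_+\rangle=F\ .\qquad (6.67)$$

Let $\pi$ denote the $d!$ permutations $i\mapsto\pi(i)$, $1\le i\le d$; it turns out that

$$\rho^{(d)}_F=\frac{1}{d!}\sum_\pi P^\phi_\pi\ ,\qquad P^\phi_\pi:=U_\pi|\phi\rangle\langle\phi|U_\pi^\dagger\ ,\qquad (6.68)$$

where $|\phi\rangle\in\mathbb{C}^d$ is any vector such that $|\langle\psi_+|\phi\rangle|^2=F$ and $U_\pi$ unitarily implements the permutation of the chosen ONB corresponding to $\pi$.

Let $A$ denote the maximally Abelian subalgebra generated by the projections $\{|i\rangle\langle i|\}_{i=1}^d$; the decomposition (6.68) is such that

$$E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]\le\frac{1}{d!}\sum_\pi S(P^\phi_\pi\upharpoonright A)=S(|\phi\rangle\langle\phi|\upharpoonright A)=-\sum_{j=1}^d|\langle j|\phi\rangle|^2\log|\langle j|\phi\rangle|^2=:r(F)\ .\qquad (6.69)$$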
A 
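Proposition 6.3.7 If $\rho^{(d)}_F$ is a permutation invariant state on $M_d(\mathbb{C})$ and $r(F)$ is a convex function of $F\in[0,1]$, the decomposition (6.68) achieves $E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]$.

Proof Let $\rho^{(d)}_F=\sum_i\lambda_iP_i$, $P_i=|\phi_i\rangle\langle\phi_i|$, achieve $E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]$ and consider

$$\rho^{(d)}_F=\frac{1}{d!}\sum_\pi U_\pi\,\rho^{(d)}_F\,U_\pi^\dagger=\sum_i\lambda_i\,\rho^{(d)}_{F_i}\ ,\qquad\rho^{(d)}_{F_i}:=\frac{1}{d!}\sum_\pi U_\pi\,P_i\,U_\pi^\dagger\ .$$

The states $\rho^{(d)}_{F_i}$ are permutation invariant; according to (6.68) they are completely characterized by parameters $F_i$ that satisfy

$$F=\langle\psi_+|\rho^{(d)}_F|\psi_+\rangle=\sum_i\lambda_i\,\langle\psi_+|P_i|\psi_+\rangle=\sum_i\lambda_iF_i\ .$$

Thus, Proposition 6.3.6, the assumed convexity of $r(F)$ and (6.69) yield

$$E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]=\sum_i\lambda_i\,S(P_i\upharpoonright A)=\sum_i\lambda_i\,r(F_i)\ge r(F)\ge E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]\ .$$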


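Examples 6.3.2

1. For $d=2$, $\rho^{(2)}_F=\frac12\begin{pmatrix}1&2F-1\\2F-1&1\end{pmatrix}$ can be written as

$$\rho^{(2)}_F=\frac12\cdot\frac12\begin{pmatrix}1+\alpha&2F-1\\2F-1&1-\alpha\end{pmatrix}+\frac12\cdot\frac12\begin{pmatrix}1-\alpha&2F-1\\2F-1&1+\alpha\end{pmatrix}\ ,$$

where $\alpha:=2\sqrt{F(1-F)}$; then, with $\eta(x)=-x\log x$ and the notation of (6.12),

$$r(F)=\eta\Big(\frac{1+\alpha}{2}\Big)+\eta\Big(\frac{1-\alpha}{2}\Big)=H_2\Big(\frac{1+2\sqrt{F(1-F)}}{2}\Big)\ .$$

In order to use the previous proposition, we need to show that $r(F)$ is convex on $[0,1]$; for this we calculate

$$\frac{d^2r(F)}{dF^2}=\frac{2}{\alpha^3}\Big(\log\frac{1+\alpha}{1-\alpha}-2\alpha\Big)\ .\qquad (6.70)$$

The function within the parenthesis is monotonically increasing from $0$ to $+\infty$; the second derivative is thus non-negative and the function $r(F)$ is convex. Then,

$$E_{\rho^{(2)}_F}[M_2(\mathbb{C}),A]=H_2\Big(\frac{1+2\sqrt{F(1-F)}}{2}\Big)\ ,\qquad H_{\rho^{(2)}_F}(A)=\log2-H_2\Big(\frac{1+2\sqrt{F(1-F)}}{2}\Big)\ ,$$

where $A$ is the Abelian subalgebra of diagonal $2\times2$ matrices. Notice that this is the only Abelian subalgebra in the $d=2$ case: in [61] $H_\rho(A)$ has been computed for all states $\rho\in M_2(\mathbb{C})$.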

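2. Given a fixed ONB $\{|i\rangle\}_{i=1}^d$ in $\mathbb{C}^d$, consider the doubling map

$$M_d(\mathbb{C})\ni X=\sum_{i,j=1}^dx_{ij}\,|i\rangle\langle j|\mapsto D[X]=\sum_{i,j=1}^dx_{ij}\,|ii\rangle\langle jj|\ .\qquad (6.71)$$

It is a homomorphism from $M_d(\mathbb{C})$ onto a subalgebra $M_0\subset M_{d^2}(\mathbb{C})$,

$$D[X]\,D[Y]=\sum_{i,j;k,\ell=1}^dx_{ij}\,y_{k\ell}\,|ii\rangle\langle jj|kk\rangle\langle\ell\ell|=\sum_{i,\ell=1}^d\Big(\sum_{j=1}^dx_{ij}\,y_{j\ell}\Big)|ii\rangle\langle\ell\ell|=D[XY]\ .\qquad (6.72)$$

It is thus a positive linear map from $M_d(\mathbb{C})$ onto $M_0\subset M_{d^2}(\mathbb{C})$ where it is invertible:

$$M_0\ni X_0=\sum_{i,j=1}^dx_{ij}\,|ii\rangle\langle jj|\mapsto D^{-1}[X_0]=\sum_{i,j=1}^dx_{ij}\,|i\rangle\langle j|\in M_d(\mathbb{C})\ .\qquad (6.73)$$

Let $d=2$ and $|0\rangle$, $|1\rangle$ be the fixed ONB in $\mathbb{C}^2$; when applied to the permutation invariant state in the previous example, the doubling map gives the state

$$R^{(2^2)}_F:=D[\rho^{(2)}_F]=\frac{|00\rangle\langle00|+|11\rangle\langle11|}{2}+\frac{2F-1}{2}\big(|00\rangle\langle11|+|11\rangle\langle00|\big)$$

$$=\frac14\Big(\mathbb{1}_4+(2F-1)(\sigma_1\otimes\sigma_1-\sigma_2\otimes\sigma_2)+\sigma_3\otimes\sigma_3\Big)=\frac12\begin{pmatrix}1&0&0&2F-1\\0&0&0&0\\0&0&0&0\\2F-1&0&0&1\end{pmatrix}\ .$$

It thus turns out that $R_F$ as defined in (6.13) equals $R^{(2^2)}_F$ so that, for $F\ne1/2$, $R^{(2^2)}_F$ is entangled with concurrence $C(R^{(2^2)}_F)=|1-2F|$ (see (6.14)) and entanglement of formation (see (6.15) and Theorem 6.2.1) given by

$$E_F[R^{(2^2)}_F]=H_2\Big(\frac{1+2\sqrt{F(1-F)}}{2}\Big)=E_{\rho^{(2)}_F}[M_2(\mathbb{C}),A]\ .$$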
PF 
3. In [62], Proposition 6.3.7 has been used to compute $E_{\rho^{(3)}_F}[M_3(\mathbb{C}),A]$ where

$$\rho^{(3)}_F=\frac13\begin{pmatrix}1&\frac{3F-1}{2}&\frac{3F-1}{2}\\\frac{3F-1}{2}&1&\frac{3F-1}{2}\\\frac{3F-1}{2}&\frac{3F-1}{2}&1\end{pmatrix}$$

and $A$ consists of diagonal matrices in this representation. While for $d=2$ there is only one optimal decomposition achieving $E_{\rho^{(2)}_F}[M_2(\mathbb{C}),A]$, when $d=3$ more optimal decompositions appear. Indeed, one can decompose $\rho^{(3)}_F$ by means of the unitary operator $U:\mathbb{C}^3\mapsto\mathbb{C}^3$ that implements the permutation $(1,2,3)\mapsto(3,1,2)$:

$$\rho^{(3)}_F=\frac13|\phi\rangle\langle\phi|+\frac13U|\phi\rangle\langle\phi|U^\dagger+\frac13U^2|\phi\rangle\langle\phi|(U^\dagger)^2\ ,\qquad (6.74)$$

where

$$|\phi\rangle=\frac13\begin{pmatrix}a+2b\cos\theta\\a-2b\cos(\theta-\pi/3)\\a-2b\cos(\theta+\pi/3)\end{pmatrix}\ ,\qquad a:=\sqrt{3F}\ ,\quad b:=\sqrt{\frac32(1-F)}\ .$$

It turns out that, for $0\le F\le8/9$, the function $r(F)=S(|\phi\rangle\langle\phi|\upharpoonright A)$ is convex, whence

$$E_{\rho^{(3)}_F}[M_3(\mathbb{C}),A]=\eta\Big(\frac{2-F+2\sqrt{2F(1-F)}}{3}\Big)+2\,\eta\Big(\frac{1+F-2\sqrt{2F(1-F)}}{6}\Big)\ .\qquad (6.75)$$

There exists a value $0<F^*<8/9$ such that $E_{\rho^{(3)}_F}[M_3(\mathbb{C}),A]$ is achieved at a unique decomposition of the form (6.74) given by

$$|\phi\rangle=\frac{1}{\sqrt3}\begin{pmatrix}\sqrt F+\sqrt{2(1-F)}\\\sqrt F-\sqrt{(1-F)/2}\\\sqrt F-\sqrt{(1-F)/2}\end{pmatrix}$$

for $F^*\le F\le8/9$; while, for $0\le F<F^*$, two optimal decompositions of the form (6.74) appear with

$$|\phi^\pm_F\rangle=\frac13\begin{pmatrix}a+2b\cos\theta_F\\a-2b\cos(\pi/3\mp\theta_F)\\a-2b\cos(\pi/3\pm\theta_F)\end{pmatrix}\ ,$$

where the angle $\theta_F$ varies with $F$. According to Proposition 6.3.5, all linear convex combinations of the projections onto these vectors also provide optimal decompositions. When $8/9<F\le1$, the function $r(F)=S(|\phi\rangle\langle\phi|\upharpoonright A)$ is no longer convex and one cannot use Proposition 6.3.7; in this case it is the close relation of $H_\rho(M)$ with the entanglement of formation which is of help. Indeed, (6.75) coincides with the entanglement of formation of the $d=3$ isotropic states (6.3) for $1/3\le F\le8/9$ as calculated in [351].
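Two of the statements in Examples 6.3.2 lend themselves to a quick numerical check: the closed form (6.70) of the second derivative of $r(F)$ for $d=2$, and the fact that the cyclic mixture (6.74) reproduces $\rho^{(3)}_F$, with the $\theta=0$ vector giving the entropy in (6.75). The sketch below assumes NumPy and natural logarithms; the sample values of $F$ and $\theta$, the finite-difference step and all function names are illustrative.

```python
import numpy as np

def shannon(p):
    """Shannon entropy (natural logarithm) of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log(p)))

# --- d = 2: compare (6.70) with a finite-difference second derivative ---
def r2(F):
    a = 2 * np.sqrt(F * (1 - F))
    return shannon([(1 + a) / 2, (1 - a) / 2])

def r2_second_derivative(F):
    a = 2 * np.sqrt(F * (1 - F))
    return 2 / a**3 * (np.log((1 + a) / (1 - a)) - 2 * a)

grid = np.linspace(0.05, 0.45, 9)
h = 1e-4
numeric = np.array([(r2(F + h) - 2 * r2(F) + r2(F - h)) / h**2 for F in grid])
closed = r2_second_derivative(grid)

# --- d = 3: the cyclic mixture (6.74) and the entropy (6.75) -------------
def phi3(F, theta):
    a, b = np.sqrt(3 * F), np.sqrt(1.5 * (1 - F))
    return np.array([a + 2 * b * np.cos(theta),
                     a - 2 * b * np.cos(theta - np.pi / 3),
                     a - 2 * b * np.cos(theta + np.pi / 3)]) / 3

def rho3(F):
    off = (3 * F - 1) / 2
    return (np.full((3, 3), off) + (1 - off) * np.eye(3)) / 3

F, theta = 0.5, 0.4
U = np.roll(np.eye(3), 1, axis=0)          # cyclic shift of the basis vectors
phi = phi3(F, theta)
P = np.outer(phi, phi)
mixture = sum(np.linalg.matrix_power(U, k) @ P @ np.linalg.matrix_power(U, k).T
              for k in range(3)) / 3

gamma = (2 - F + 2 * np.sqrt(2 * F * (1 - F))) / 3
entropy_675 = shannon([gamma, (1 - gamma) / 2, (1 - gamma) / 2])
entropy_theta0 = shannon(phi3(F, 0.0) ** 2)
print(np.max(np.abs(numeric - closed)), entropy_675, entropy_theta0)
```

The grid stays away from $F=1/2$, where $\alpha=1$ and the closed form (6.70) diverges logarithmically.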


In order to expose the relation between the entanglement of formation (6.5) and $E_\rho[\mathbb{M},M]$ in (6.62), set $\mathbb{M}=M_{d^2}(\mathbb{C}):=M_d(\mathbb{C})\otimes M_d(\mathbb{C})$, $M=M_d(\mathbb{C})$ embedded as $M_d(\mathbb{C})\otimes\mathbb{1}_d$ into $M_{d^2}(\mathbb{C})$ and $\rho_j=|\psi_j\rangle\langle\psi_j|$. Since the marginal density matrices $\rho^{(1)}_{\psi_j}=\rho_j\upharpoonright M$, it turns out that

$$E_F[\rho]=E_\rho[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]\ .\qquad (6.76)$$

Further insights into the connections between these two notions, with particular reference to Examples 6.3.2.2 and 6.3.2.3, come from [63].

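Proposition 6.3.8 Let $A\subset M_d(\mathbb{C})$ be the maximally Abelian subalgebra corresponding to a fixed ONB $\{|i\rangle\}_{i=1}^d$ and $D[\cdot]$ the doubling map (6.71); then,

$$E_\rho[M_d(\mathbb{C}),A]=E_{D[\rho]}[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]\ .\qquad (6.77)$$

Proof Suppose $\rho=\sum_j\lambda_jP_j$ achieves $E_\rho[M_d(\mathbb{C}),A]$; then, with $\rho=\sum_{i,j=1}^dr_{ij}\,|i\rangle\langle j|$,

$$D[\rho]\upharpoonright M_d(\mathbb{C})=\mathrm{Tr}_2(D[\rho])=\sum_{i=1}^dr_{ii}\,|i\rangle\langle i|=\rho\upharpoonright A$$

implies

$$E_{D[\rho]}[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]\le\sum_j\lambda_j\,S\big(D[P_j]\upharpoonright M_d(\mathbb{C})\big)=\sum_j\lambda_j\,S(P_j\upharpoonright A)=E_\rho[M_d(\mathbb{C}),A]\ .$$

Vice versa, let $D[\rho]=\sum_j\nu_jQ_j$ achieve $E_{D[\rho]}[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]$; if the optimal decomposers $Q_j$ were of the form $Q_j=\sum_{k,\ell=1}^dq^j_{k\ell}\,|kk\rangle\langle\ell\ell|$, by the inverse doubling map (6.73) one would get a decomposition of $\rho$ that could be used to reverse the previous inequality and thus prove the result. The decomposers $Q_j$ are indeed of the claimed form as they are one-dimensional projections that can always be recast as follows

$$Q_j=\frac{\sqrt{D[\rho]}\,|\psi_j\rangle\langle\psi_j|\,\sqrt{D[\rho]}}{\langle\psi_j|D[\rho]|\psi_j\rangle}\ .$$

Because of (6.72), it turns out that $D[\rho]^n=D[\rho^n]$ whence, by power series expansion, $\sqrt{D[\rho]}=D[\sqrt{\rho}]$.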
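The algebraic properties of the doubling map used in this proof, namely the homomorphism property (6.72), the marginal $\mathrm{Tr}_2(D[\rho])=\rho\upharpoonright A$ and the relation $\sqrt{D[\rho]}=D[\sqrt\rho]$, are easy to test numerically. The sketch assumes NumPy and orders the product basis as $|ij\rangle\to i\,d+j$; the function names are illustrative.

```python
import numpy as np

def doubling(x):
    """D[X] = sum_ij x_ij |ii><jj| as a d^2 x d^2 matrix."""
    d = x.shape[0]
    idx = np.arange(d) * (d + 1)          # position of |ii> in the product basis
    out = np.zeros((d * d, d * d), dtype=complex)
    out[np.ix_(idx, idx)] = x
    return out

def sqrt_psd(rho):
    """Square root of a positive semi-definite Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def trace_out_second(r, d):
    """Partial trace over the second factor of C^d (x) C^d."""
    return np.einsum("ajbj->ab", r.reshape(d, d, d, d))

rng = np.random.default_rng(5)
d = 3
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = g @ g.conj().T
rho /= np.trace(rho).real

homomorphism_ok = np.allclose(doubling(X) @ doubling(Y), doubling(X @ Y))
marginal = trace_out_second(doubling(rho), d)
root_ok = np.allclose(sqrt_psd(doubling(rho)), doubling(sqrt_psd(rho)), atol=1e-6)
print(homomorphism_ok, np.allclose(marginal, np.diag(np.diag(rho))), root_ok)
```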


Example 6.3.6 In [351], the entanglement of formation of an isotropic state $\rho^{(d^2)}_F$ was computed by (1) considering the twirling (2) of suitable vectors of diagonal form, $|\Phi\rangle=\sum_{i=1}^d\sqrt{\mu_i}\,|ii\rangle$, with respect to the chosen ONB, and by (2) minimizing the von Neumann entropy $S(\rho^{(1)}_\Phi)=-\sum_{i=1}^d\mu_i\log\mu_i$ of the marginal density matrix.

Choose one such $|\Phi\rangle$ from an optimal decomposition for $E_F[\rho^{(d^2)}_F]$ and construct the density matrix

$$R^{(d^2)}_F=\frac{1}{d!}\sum_\pi U_\pi\otimes U_\pi\,|\Phi\rangle\langle\Phi|\,U_\pi^{-1}\otimes U_\pi^{-1}$$

by using the permutation operators $U_\pi$. Since, by definition, $U_\pi\otimes U_\pi$ are symmetries for the isotropic state $\rho^{(d^2)}_F$, using Proposition 6.3.6 one deduces that $E_{R^{(d^2)}_F}[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]=S(\rho^{(1)}_\Phi)$. On the other hand, in terms of the doubling map (6.71) and using (6.67),

$$R^{(d^2)}_F=D[\rho^{(d)}_F]=\frac{1}{d!}\sum_\pi D\big[U_\pi|\phi\rangle\langle\phi|U_\pi^\dagger\big]\ ,$$

where $\rho^{(d)}_F$ is as in (6.68) and $|\phi\rangle=\sum_{i=1}^d\sqrt{\mu_i}\,|i\rangle$. Finally, from Proposition 6.3.8, it results

$$E_{\rho^{(d)}_F}[M_d(\mathbb{C}),A]=E_{R^{(d^2)}_F}[M_{d^2}(\mathbb{C}),M_d(\mathbb{C})]=S(\rho^{(1)}_\Phi)\ .$$

In this way one can use the results of [351] to extend the computation of $E_{\rho^{(3)}_F}[M_3(\mathbb{C}),A]$ to those values of $F\in[8/9,1]$, where the methods employed in Example 6.3.2.3 are useless.


6.3.2.1 Trace-Distance and Fidelities 

In this section we review some mathematical techniques that are used to compare two
quantum states of a system $S$; the importance of such an issue will become apparent
in the next chapter when we shall deal with the compression and retrieval of strings
of qubits. We shall assume $S$ to be an $N$-level system.

Definition 6.3.4 Given $\rho_{1,2} \in \mathcal{S}(S)$, their trace-distance is given by

\[ D(\rho_1, \rho_2) := \frac{1}{2}\,\mathrm{Tr}|\rho_1 - \rho_2| . \tag{6.78} \]

Namely, the trace distance of two density matrices is defined as half the trace-norm
$\|\rho_1 - \rho_2\|_{\mathrm{tr}}$ of their difference: $D(\rho_1, \rho_2)$ is a proper distance on the
state-space $\mathcal{S}(S)$.
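As an illustration of Definition 6.3.4, the following minimal sketch evaluates (6.78) for a single qubit ($N=2$). It assumes that the density matrices are given as nested Python lists and uses the closed form of the eigenvalues of a $2\times 2$ Hermitian matrix; the function name and the sample states are hypothetical choices made only for this example.

```python
import math

def trace_distance_2x2(rho, sigma):
    """Trace distance (6.78) for two 2x2 density matrices given as nested lists."""
    # Entries of the Hermitian difference rho - sigma = [[a, b], [conj(b), c]].
    a = complex(rho[0][0] - sigma[0][0]).real
    c = complex(rho[1][1] - sigma[1][1]).real
    b = complex(rho[0][1] - sigma[0][1])
    # Eigenvalues of a 2x2 Hermitian matrix: mean +/- radius.
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    # Tr|X| is the sum of the absolute values of the eigenvalues of X.
    return 0.5 * (abs(mean + radius) + abs(mean - radius))

zero = [[1, 0], [0, 0]]          # |0><0|
one = [[0, 0], [0, 1]]           # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|
mixed = [[0.5, 0], [0, 0.5]]     # maximally mixed state

print(trace_distance_2x2(zero, one))    # orthogonal pure states
print(trace_distance_2x2(zero, mixed))  # pure state against the maximally mixed one
print(trace_distance_2x2(zero, plus))   # non-orthogonal pure states
```

Orthogonal pure states are at the maximal distance $1$, while for two pure qubit states the distance equals $\sqrt{1-|\langle\psi_1|\psi_2\rangle|^2}$.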
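Proposition 6.3.9 The trace distance enjoys the following properties:

1. Let $P \in M_N(\mathbb{C})$ be any orthogonal projector, then

\[ D(\rho_1, \rho_2) = \max_P \mathrm{Tr}\big(P\,(\rho_1 - \rho_2)\big) . \tag{6.79} \]

2. The trace-distance monotonically decreases under completely positive trace-
preserving maps $\mathbb{F} : \mathcal{S}(S) \mapsto \mathcal{S}(S)$:

\[ D(\mathbb{F}[\rho_1], \mathbb{F}[\rho_2]) \le D(\rho_1, \rho_2) . \tag{6.80} \]

3. The trace-distance is jointly convex:

\[ D\Big(\sum_j \lambda_j \rho_j\,,\, \sum_j \lambda_j \sigma_j\Big) \le \sum_j \lambda_j\, D(\rho_j, \sigma_j) , \tag{6.81} \]

where $\lambda_j \ge 0$ and $\sum_j \lambda_j = 1$.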


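Proof As seen in Example 5.5.6, $\rho_1 - \rho_2 = A - B$ and $|\rho_1 - \rho_2| = A + B$, with
$A$, $B$ positive orthogonal matrices, $A, B \ge 0$, $AB = 0$, so that $\mathrm{Tr}A = \mathrm{Tr}B$ since
$\mathrm{Tr}\rho_{1,2} = 1$. Thus $D(\rho_1, \rho_2) = \mathrm{Tr}A = \mathrm{Tr}B$. Let $P$ be any projector, then

\[ \mathrm{Tr}(P(A - B)) \le \mathrm{Tr}(PA) \le \mathrm{Tr}A = D(\rho_1, \rho_2) , \]

for $\mathrm{Tr}(PB)$ is a positive quantity and $P$ projects onto a subspace. Further, if this
subspace supports $A$, it annihilates $B$ and the maximum is achieved.

The second property is proved as follows: let $P$ be the projector which achieves
the trace distance $D(\mathbb{F}[\rho_1], \mathbb{F}[\rho_2])$, then, because of the assumed trace-preserving
character of $\mathbb{F}$,

\[ D(\rho_1, \rho_2) = \mathrm{Tr}A = \mathrm{Tr}\,\mathbb{F}[A] \ge \mathrm{Tr}(P\,\mathbb{F}[A]) \ge \mathrm{Tr}(P\,\mathbb{F}[A]) - \mathrm{Tr}(P\,\mathbb{F}[B])
   = \mathrm{Tr}(P\,\mathbb{F}[A - B]) = \mathrm{Tr}\big(P(\mathbb{F}[\rho_1] - \mathbb{F}[\rho_2])\big) = D(\mathbb{F}[\rho_1], \mathbb{F}[\rho_2]) . \]

The third property follows from (6.79): if $P$ achieves the trace-distance on the left
hand side of (6.81), then $D\big(\sum_j \lambda_j\rho_j, \sum_j\lambda_j\sigma_j\big) = \sum_j \lambda_j \mathrm{Tr}(P(\rho_j - \sigma_j)) \le \sum_j \lambda_j D(\rho_j, \sigma_j)$.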
In order to introduce some useful notions of fidelity, let us begin with a simple
observation: the closer two state vectors $\psi_{1,2} \in H = \mathbb{C}^N$ are to each other, the closer
to 1 is $|\langle \psi_1|\psi_2\rangle|$. Indeed, the latter quantity is 1 iff $\psi_1 = \psi_2$ (apart from an overall
multiplicative phase) and vanishes when $\psi_1 \perp \psi_2$. This idea extends to density matrices
of an $N$-level system as follows.


Definition 6.3.5 (Fidelity) The fidelity of two density matrices $\rho_{1,2} \in \mathcal{S}(S)$ is

\[ F(\rho_1, \rho_2) := \mathrm{Tr}\sqrt{\sqrt{\rho_1}\,\rho_2\,\sqrt{\rho_1}} = \mathrm{Tr}\sqrt{\sqrt{\rho_1}\sqrt{\rho_2}\sqrt{\rho_2}\sqrt{\rho_1}}
   = \mathrm{Tr}\big|\sqrt{\rho_2}\sqrt{\rho_1}\big| . \tag{6.82} \]

If $\rho_1 = |\psi_1\rangle\langle\psi_1| =: P_1$ then $\sqrt{P_1} = P_1$ so that

\[ F(P_1, \rho_2) = \sqrt{\langle \psi_1|\,\rho_2\,|\psi_1\rangle} . \tag{6.83} \]

Thus if $\rho_2 = |\psi_2\rangle\langle\psi_2| =: P_2$, then $F(P_1, P_2) = |\langle \psi_1|\psi_2\rangle|$.


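Proposition 6.3.10 The fidelity enjoys the following properties:

1. Let $|\psi_\rho^{UV}\rangle$ be a purification of $\rho \in \mathcal{S}(S)$ of the form

\[ |\psi_\rho^{UV}\rangle = \sum_{i=1}^N \sqrt{\rho}\,U|i\rangle \otimes V|i\rangle , \tag{6.84} \]

where $\{|i\rangle\}_{i=1}^N$ is an orthonormal basis in $H = \mathbb{C}^N$, $U$ any partial isometry
such that $UU^\dagger$ projects onto the orthogonal complement of $\mathrm{Ker}(\rho)$ and $V$ is any
unitary matrix. Then,

\[ F(\rho_1, \rho_2) = \max_{U_2, V_2} \big|\langle \psi_{\rho_1}^{U_1V_1}\,|\,\psi_{\rho_2}^{U_2V_2}\rangle\big| , \tag{6.85} \]

that is the fidelity is the largest such scalar product achievable by fixing one
purification and varying the other.

2. The fidelity does not depend on the order of its arguments. Further, $F(\rho_1, \rho_2) = 1$
if and only if $\rho_1 = \rho_2$, otherwise $0 \le F(\rho_1, \rho_2) < 1$.

3. Let $\mathcal{E} = \{E_i\}_{i\in I}$ denote any POVM with elements in $M_N(\mathbb{C})$, then

\[ F(\rho_1, \rho_2) = \min\Big\{ \sum_{i\in I} \sqrt{\mathrm{Tr}(\rho_1 E_i)\,\mathrm{Tr}(\rho_2 E_i)} \,:\, E_i \in \mathcal{E} \Big\} . \tag{6.86} \]

4. The fidelity is jointly concave, namely if $\rho_{1,2} = \sum_i \lambda_i\, \sigma_i^{1,2}$ with $0 \le \lambda_i \le 1$,
$\sum_i \lambda_i = 1$ and $\sigma_i^{1,2} \in \mathcal{S}(S)$,

\[ F(\rho_1, \rho_2) \ge \sum_i \lambda_i\, F(\sigma_i^1, \sigma_i^2) . \tag{6.87} \]

5. The fidelity monotonically increases under the action of trace-preserving
completely positive maps $\mathbb{F} : \mathcal{S}(S) \mapsto \mathcal{S}(S)$:

\[ F(\mathbb{F}[\rho_1], \mathbb{F}[\rho_2]) \ge F(\rho_1, \rho_2) . \tag{6.88} \]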
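Proof It is easy to check that $\mathrm{Tr}_2 |\psi_\rho^{UV}\rangle\langle\psi_\rho^{UV}| = \rho$, so that (6.84) is a purification
of the mixed state $\rho$. One computes,

\[ \big|\langle \psi_{\rho_1}^{U_1V_1}\,|\,\psi_{\rho_2}^{U_2V_2}\rangle\big|
   = \Big|\sum_{i,j=1}^N \langle i|\,U_1^\dagger\sqrt{\rho_1}\sqrt{\rho_2}\,U_2\,|j\rangle\,\langle i|\,V_1^\dagger V_2\,|j\rangle\Big|
   = \Big|\mathrm{Tr}\Big(U_1^\dagger\sqrt{\rho_1}\sqrt{\rho_2}\,U_2\,(V_1^\dagger V_2)^T\Big)\Big|
   \le \mathrm{Tr}\big|\sqrt{\rho_2}\sqrt{\rho_1}\big| = F(\rho_1, \rho_2) , \]

where $T$ means transposition. Further, the upper bound is achieved by choosing
$V_2 = V_1$ and $U_2 = W^\dagger U_1$, with $W$ such that $\sqrt{\rho_2}\sqrt{\rho_1} = W\big|\sqrt{\rho_2}\sqrt{\rho_1}\big|$.
From the previous point, the second point follows at once.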


One expects a relation between trace-distance and fidelity of the kind: the smaller 
the trace-distance, the closer to 1 the fidelity. That this is indeed so is the content of 
the following 


Proposition 6.3.11 Given $\rho_{1,2} \in \mathcal{S}(S)$, the following bounds hold

\[ 1 - F(\rho_1, \rho_2) \le D(\rho_1, \rho_2) \le \sqrt{1 - F^2(\rho_1, \rho_2)} . \tag{6.89} \]
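The bounds relating fidelity and trace-distance can be checked numerically on single-qubit states. The sketch below is only an illustration under stated assumptions: states are parametrized by Bloch vectors, and the fidelity is evaluated through the closed form $F^2 = \mathrm{Tr}(\rho_1\rho_2) + 2\sqrt{\det\rho_1\,\det\rho_2}$, a standard identity for $2\times 2$ density matrices that is not derived in the text. All function names are hypothetical.

```python
import math

def qubit_from_bloch(x, y, z):
    """Density matrix (1 + r.sigma)/2 for a Bloch vector r = (x, y, z), |r| <= 1."""
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]

def det2(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def trace_of_product(m, n):
    return sum(m[i][j] * n[j][i] for i in range(2) for j in range(2)).real

def fidelity(rho, sigma):
    """Fidelity Tr sqrt(sqrt(rho) sigma sqrt(rho)) via the 2x2 closed form."""
    f_squared = trace_of_product(rho, sigma) + 2 * math.sqrt(max(det2(rho) * det2(sigma), 0.0))
    return math.sqrt(min(max(f_squared, 0.0), 1.0))

def trace_distance(rho, sigma):
    """Half the sum of the absolute eigenvalues of the Hermitian matrix rho - sigma."""
    a = complex(rho[0][0] - sigma[0][0]).real
    c = complex(rho[1][1] - sigma[1][1]).real
    b = complex(rho[0][1] - sigma[0][1])
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return 0.5 * (abs(mean + radius) + abs(mean - radius))

def bounds_hold(r1, r2, tol=1e-9):
    """Check 1 - F <= D <= sqrt(1 - F^2) up to a numerical tolerance."""
    rho, sigma = qubit_from_bloch(*r1), qubit_from_bloch(*r2)
    f, d = fidelity(rho, sigma), trace_distance(rho, sigma)
    return 1 - f <= d + tol and d <= math.sqrt(max(1 - f * f, 0.0)) + tol

pairs = [((0, 0, 1), (0, 0, -1)), ((0, 0, 1), (1, 0, 0)),
         ((0, 0, 0.5), (0.3, 0.4, 0)), ((0, 0, 0), (0, 0, 1)),
         ((0.2, -0.1, 0.7), (0.2, -0.1, 0.7))]
print(all(bounds_hold(r1, r2) for r1, r2 in pairs))
```

For pairs of pure states the upper bound is saturated, which is why a small tolerance is used in the comparison.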


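The following proposition establishes that if $\rho_1 \in \mathcal{S}(S)$ is close to $\rho_2$ in the sense
that $F(\rho_1, \rho_2) \simeq 1$ while $F(\rho_2, \rho_3) \simeq 0$, then also $F(\rho_1, \rho_3) \simeq 0$.


Proposition 6.3.12 ([20]) Let $\rho_{1,2,3} \in \mathcal{S}(S)$, then $F_{ij} := F^2(\rho_i, \rho_j)$ satisfy

\[ F_{13} \le F_{23} + 2\big(1 - \sqrt{F_{12}}\big) + 2\sqrt{2\big(1 - \sqrt{F_{12}}\big)F_{23}} , \tag{6.90} \]

where $\mathrm{Tr}\rho_3 = 1$, but $\mathrm{Tr}\rho_{1,2} \le 1$ (subnormalization).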


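Proof Notice that subnormalization does not alter either the definition of fidelity or
the first property in Proposition 6.3.10. Let then $|\Psi_1\rangle$ be a fixed purification of $\rho_1$,
choose $|\Psi_{2,3}\rangle$ in order to achieve $F_{12}$ and $F_{13}$. Further, adjust the phases of the three
vectors so that

\[ \sqrt{F_{12}} = \langle \Psi_1|\Psi_2\rangle , \quad \sqrt{F_{13}} = \langle \Psi_1|\Psi_3\rangle , \quad \sqrt{F_{23}} \ge |\langle \Psi_2|\Psi_3\rangle| . \]

Setting $|\Psi\rangle := |\Psi_1\rangle - |\Psi_2\rangle$, one estimates

\[ \langle \Psi|\Psi\rangle = \langle \Psi_1|\Psi_1\rangle + \langle \Psi_2|\Psi_2\rangle - 2\,\langle \Psi_1|\Psi_2\rangle \le 2\big(1 - \sqrt{F_{12}}\big) , \]

for subnormalization gives $\langle \Psi_{1,2}|\Psi_{1,2}\rangle \le 1$. Then, from $\langle \Psi_3|\Psi_3\rangle = \mathrm{Tr}\rho_3 = 1$ and
the bound

\[ \sqrt{F_{13}} = \langle \Psi_1|\Psi_3\rangle = \langle \Psi_2|\Psi_3\rangle + \langle \Psi|\Psi_3\rangle \le \sqrt{F_{23}} + |\langle \Psi|\Psi_3\rangle|
   \le \sqrt{F_{23}} + \sqrt{\langle \Psi|\Psi\rangle} \le \sqrt{F_{23}} + \sqrt{2\big(1 - \sqrt{F_{12}}\big)} , \]

the result follows by squaring.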


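Let $\rho \in \mathcal{S}(S)$ correspond to a mixture $\{\lambda_j, \rho_j\}$, $\rho = \sum_j \lambda_j \rho_j$, subjected to the
action of a trace-preserving completely positive map $\mathbb{F} : \mathcal{S}(S) \mapsto \mathcal{S}(S)$. Then one would
like to keep track of how much $\mathbb{F}[\rho]$ differs from $\rho$ in the mean: this is well described
by

Definition 6.3.6 (Ensemble Fidelity) The ensemble fidelity relative to a mixture
$\{\lambda_j, \rho_j\}$ and a completely positive action $\mathbb{F}$ is defined as the ensemble average of
square fidelities,

\[ F_{av}\big(\{\lambda_j, \rho_j\}, \mathbb{F}\big) := \sum_j \lambda_j\, F^2\big(\rho_j, \mathbb{F}[\rho_j]\big) . \tag{6.91} \]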
J 


We shall also denote by 
F,(p, F) := sup{ Fav (A, P;}, F) : P? = Pj = P , (6.92) 


the supremum of the ensemble fidelities over all possible decompositions of p as a 
mixture of pure states. 


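Example 6.3.7 ([20]) Let $|i\rangle \in H = \mathbb{C}^3$, $i = 1, 2, 3$, be an orthonormal basis and
consider the mixture represented by

\[ \rho = p_1 |\psi_1\rangle\langle\psi_1| + p_2 |\psi_2\rangle\langle\psi_2| + p_3 |\psi_3\rangle\langle\psi_3| , \quad\text{where} \]
\[ |\psi_1\rangle := \cos\alpha\,|1\rangle + \sin\alpha\,|2\rangle , \qquad |\psi_2\rangle := \sin\alpha\,|1\rangle + \cos\alpha\,|2\rangle \]

and $|\psi_3\rangle$ is such that

\[ \langle \psi_3|\psi_1\rangle = \langle \psi_3|\psi_2\rangle = \langle \psi_1|\psi_2\rangle = \sin 2\alpha . \]

Suppose the system is subjected to a trace-preserving completely positive map $\mathbb{F}$
such that

\[ \mathbb{F}[|\psi_1\rangle\langle\psi_1|] = |1\rangle\langle 1| , \qquad \mathbb{F}[|\psi_2\rangle\langle\psi_2|] = |2\rangle\langle 2| , \]
\[ \mathbb{F}[|\psi_3\rangle\langle\psi_3|] = \frac{1}{2}\,|\psi_1\rangle\langle\psi_1| + \frac{1}{2}\,|\psi_2\rangle\langle\psi_2| . \tag{6.93} \]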
The purification of a state $\rho \in \mathcal{S}(S)$ actually couples $S$ to an ancilla and this
coupling is embodied by an entangled pure state $|\psi_\rho\rangle$. The following fidelity reflects
how much the action of an operation on $S$ described by a completely positive trace-
preserving map $\mathbb{F} : \mathcal{S}(S) \mapsto \mathcal{S}(S)$ preserves this entanglement.


Definition 6.3.7 (Entanglement Fidelity) The entanglement fidelity of $\rho$ relative to
$\mathbb{F}$ is defined by the square fidelity of the states $|\psi_\rho\rangle\langle\psi_\rho|$ and $\mathbb{F} \otimes \mathrm{id}[\,|\psi_\rho\rangle\langle\psi_\rho|\,]$,
where $|\psi_\rho\rangle$ is any purification of $\rho$:

\[ F_{ent}(\rho, \mathbb{F}) := F^2\Big(|\psi_\rho\rangle\langle\psi_\rho|\,,\ \mathbb{F} \otimes \mathrm{id}[\,|\psi_\rho\rangle\langle\psi_\rho|\,]\Big) . \tag{6.94} \]

Relations between these various fidelities are as follows [266].

Proposition 6.3.13

\[ 0 \le F_{ent}(\rho, \mathbb{F}) \le F_{av} \le F(\rho, \mathbb{F}[\rho]) \le 1 , \tag{6.95} \]

where $F_{av}$ is any ensemble fidelity corresponding to a decomposition of $\rho$.
As seen in Sect. 5.5.7, entanglement is associated with the presence of non-local
correlations and with the action of non-local operators. Some of the uses of entan-
glement have been presented in Sect. 6.1; in the following, we shall first consider in
more detail the non-locality issue and then the possible advantages that might come
from the use of entangled states in quantum metrological scenarios, whereby one
estimates the value of a suitable parameter by encoding it in a quantum state [46, 83].

Definition 6.4.1 Given a bipartite system $S_{12} = S_1 + S_2$, strictly local operators are
those pertaining to one of the two systems only, namely those of the form $X_1 \otimes \mathbf{1}_2$
with $X_1 \in B(H_1)$ and $\mathbf{1}_1 \otimes X_2$ where $X_2 \in B(H_2)$. Their products

\[ X_1 \otimes X_2 = (X_1 \otimes \mathbf{1}_2)\,(\mathbf{1}_1 \otimes X_2) \]

are local operators, while sums of local operators are non-local.


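Clearly, strictly local operators commute and can thus be measured simultaneously.
The statistics associated with separable bipartite pure states $|\psi_{12}^{sep}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \in
H_1 \otimes H_2$ does not exhibit correlations between the two subsystems. Indeed, the
correlation functions $\langle \psi_{12}^{sep}|\,X_1 \otimes X_2\,|\psi_{12}^{sep}\rangle$ factorize:

\[ \langle \psi_{12}^{sep}|\,X_1 \otimes X_2\,|\psi_{12}^{sep}\rangle = \langle \psi_1|\,X_1\,|\psi_1\rangle\,\langle \psi_2|\,X_2\,|\psi_2\rangle
   = \langle \psi_{12}^{sep}|\,X_1 \otimes \mathbf{1}_2\,|\psi_{12}^{sep}\rangle\,\langle \psi_{12}^{sep}|\,\mathbf{1}_1 \otimes X_2\,|\psi_{12}^{sep}\rangle . \tag{6.96} \]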
The reverse implication is also true and considering observables $X_{1,2} = X_{1,2}^\dagger$ rather
than generic operators is sufficient to show it.


Lemma 6.4.1 If $|\psi_{12}\rangle \in H_1 \otimes H_2$ is such that all correlation functions of local
observables $X_1 \otimes X_2$ factorize as in (6.96), the state vector is separable.


Proof Given the Schmidt decomposition (5.169)

\[ |\psi_{12}\rangle = \sum_{j=1}^d \sqrt{\lambda_j}\ |\psi_j^{(1)}\rangle \otimes |\psi_j^{(2)}\rangle , \]

choose $X_1 = |\psi_k^{(1)}\rangle\langle\psi_k^{(1)}|$ and $X_2 = |\psi_\ell^{(2)}\rangle\langle\psi_\ell^{(2)}|$. Then, the hypothesis together
with the orthonormality of the vectors $|\psi_j^{(1)}\rangle$ among themselves, as well as that of
the vectors $|\psi_j^{(2)}\rangle$, yields $\lambda_k\,\delta_{k\ell} = \lambda_k\,\lambda_\ell$, so that only one Schmidt coefficient does
not vanish and $|\psi_{12}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle$.


Example 6.4.1 Consider a two-qubit system in the symmetric state (5.174) and
the local 2-spin observable $\sigma_z \otimes \sigma_z$ where, in the computational basis (see Exam-
ple 5.5.9), $\sigma_z|0\rangle = |0\rangle$ and $\sigma_z|1\rangle = -|1\rangle$. Then, one computes


(Yoo | oz ® oz |Woo) = (êo ĉo) =1, 
while, using the orthogonality to |oo) of the Bell states in Example 5.5.9, 


( oo |0: @ | Poo) = ( Yoo | 18 oz Êo) = (o| Ui) =0. 


As in the classical case, by passing from pure to mixed states, one introduces cor-
relations. Unlike in the classical case, one needs to distinguish between non-local and
non-classical correlations, the latter being identified by non-vanishing discord,
rather than by non-vanishing entanglement.


Remark 6.4.1 As seen in Sect. 2.4.4, the mutual information of two random vari-
ables $A$ and $B$ is expressed by (see (2.93))

\[ I(A; B) = H(A) - H(A|B) = H(B) - H(B|A) \tag{6.97} \]
\[ \phantom{I(A; B)} = H(A) + H(B) - H(A \vee B) , \tag{6.98} \]

where $H(A|B)$ and $H(B|A)$ are conditional entropies as in (2.90). Given a bipar-
tite quantum system $A + B$ in a state $\rho_{AB}$ acting on $H = H_A \otimes H_B$, the classical
Shannon entropies are replaced by the von Neumann entropies $S(\rho_A)$, $S(\rho_B)$ and
$S(\rho_{AB})$ of the reduced density matrices $\rho_A := \mathrm{Tr}_B(\rho_{AB})$ and $\rho_B := \mathrm{Tr}_A(\rho_{AB})$ and of
the bipartite state $\rho_{AB}$. Therefore, the quantum analog of the quantity in (6.98) is

\[ I_Q(A; B) := S(\rho_A) + S(\rho_B) - S(\rho_{AB}) . \tag{6.99} \]


On the other hand, in order to replace the conditional entropy $H(A|B)$, one has to
take into account that gathering information about system $B$ unavoidably changes it
as in Definition 5.6.1. Suppose one employs a projective POVM $P^B := \{P_j^B\}_{j\in J} \subset
B(H_B)$; then, its measurement on the system $A + B$ is described by the TPCP map
(see (5.199))

\[ \mathbb{F}_{P^B}[\rho_{AB}] = \sum_j \mathbf{1}_A \otimes P_j^B\ \rho_{AB}\ \mathbf{1}_A \otimes P_j^B . \tag{6.100} \]

As we have seen in Sect. 5.6.2, this map arises from the fact that measuring the
outcome $j$ collapses $\rho_{AB}$ into the bipartite state

\[ \frac{\mathbf{1}_A \otimes P_j^B\ \rho_{AB}\ \mathbf{1}_A \otimes P_j^B}{\mathrm{Tr}_B(\rho_B P_j^B)} \quad\text{with probability}\quad \mathrm{Tr}_B(\rho_B P_j^B) . \tag{6.101} \]

It is then natural to define the state of system $A$ conditioned on such a measurement
outcome as

\[ \rho_{A|P_j^B} := \frac{\mathrm{Tr}_B\big(\mathbf{1}_A \otimes P_j^B\ \rho_{AB}\ \mathbf{1}_A \otimes P_j^B\big)}{\mathrm{Tr}_B(\rho_B P_j^B)} , \tag{6.102} \]

the conditional entropy of system $A$ conditioned on the POVM $P^B$ by

\[ S(A|P^B) := \sum_j \mathrm{Tr}_B(\rho_B P_j^B)\ S\big(\rho_{A|P_j^B}\big) , \tag{6.103} \]

and finally replace the right hand side of the first equality in (6.97) by

\[ J_Q(A|P^B) := S(\rho_A) - S(A|P^B) . \tag{6.104} \]
Indeed, consider a fully commutative setting where

\[ \rho_{AB} = \sum_{i,j} p_{A\vee B}(i, j)\ P_i^A \otimes P_j^B , \tag{6.105} \]

with $\{P_i^A\}_i$, respectively $\{P_j^B\}_j$, orthogonal projections of maximally Abelian sub-
algebras of $B(H_A)$, respectively $B(H_B)$. Then, in such a case the measurement of the
POVM $P^B$ does not disturb the state $\rho_{AB}$ since the map in (6.100) fulfils $\mathbb{F}_{P^B}[\rho_{AB}] =
\rho_{AB}$ and one finds

\[ \rho_A = \sum_i p_A(i)\ P_i^A , \qquad \rho_B = \sum_j p_B(j)\ P_j^B , \]

where $p_A(i) := \sum_j p_{A\vee B}(i, j)$, $p_B(j) := \sum_i p_{A\vee B}(i, j)$ and

\[ \rho_{A|P_j^B} = \sum_i p_{A|B=j}(i)\ P_i^A , \qquad p_{A|B=j}(i) := \frac{p_{A\vee B}(i, j)}{p_B(j)} , \]

thus retrieving the classical conditional probability in (2.89). Then, it follows that
$S(A|P^B) = H(A|B)$ and $I_Q(A; B) = J_Q(A|P^B)$. In general, however, $I_Q(A; B)
\ge J_Q(A|P^B)$. It is then natural to introduce the so-called quantum discord as a mea-
sure of non-classical bipartite correlations between two quantum systems $A$ and $B$
[234,235,269,361]:

\[ D_Q(A|B) := I_Q(A; B) - J_Q(A|P^B) . \]


The discord is asymmetric, $D_Q(A|B) \neq D_Q(B|A)$, non-negative and vanishes if and
only if the projective POVM leaves the state $\rho_{AB}$ invariant [269]. Certainly, this is the
case for quantum ($A$)-classical ($B$) separable states of the form $\rho_{AB} = \sum_j \lambda_j\, \rho_j^A \otimes P_j^B$
where $\lambda_j \ge 0$, $\sum_j \lambda_j = 1$ and $\rho_j^A$ are generic states of system $A$, but not in
general for separable quantum ($A$)-quantum ($B$) states $\rho_{AB} = \sum_j \lambda_j\, \rho_j^A \otimes \rho_j^B$ with
$\rho_j^B$ states of $B$ that are not orthogonal projections. It follows that quantum systems
are in general discordant even if their state is not entangled, reflecting the fact that,
unless of the classical ($A$)-classical ($B$) type as in (6.105), they are not entirely
classically correlated, for accessing information about one of them may disturb the
global state.
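In the fully commutative setting (6.105) the von Neumann entropies reduce to Shannon entropies of the joint distribution $p_{A\vee B}$ and of its marginals, so that the coincidence of the two expressions for the correlations, and hence the vanishing of the discord, can be checked directly. The sketch below is a hypothetical illustration: the joint distribution is an arbitrary choice, entropies are measured in bits (an assumption, since the text does not fix the base of the logarithm), and the function names are not taken from the text.

```python
import math

def shannon(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def classical_correlations(joint):
    """Return (I, J) for a joint distribution joint[i][j] = p(i, j).

    I = H(A) + H(B) - H(A v B) mirrors (6.99);
    J = H(A) - sum_j p_B(j) H(A | B = j) mirrors (6.104).
    """
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    h_ab = shannon([p for row in joint for p in row])
    mutual = shannon(p_a) + shannon(p_b) - h_ab
    conditional = sum(
        pb * shannon([joint[i][j] / pb for i in range(len(joint))])
        for j, pb in enumerate(p_b) if pb > 0
    )
    return mutual, shannon(p_a) - conditional

joint = [[0.4, 0.1],
         [0.1, 0.4]]
mutual, one_sided = classical_correlations(joint)
print(mutual, one_sided, mutual - one_sided)  # the difference plays the role of the discord
```

For a genuinely quantum state the two quantities have to be computed from density matrices and need not coincide; the sketch only covers the classical ($A$)-classical ($B$) case.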


6.4.1 Identical Particles 


In this section we address the issue of non-locality in systems of identical particles, 
namely for assemblies of quantum systems that cannot be identified and thus dis- 
tinguished one from the other. The physical consequence of indistinguishability is 
that the state vectors of Bosonic systems must be symmetric under the exchange 
of particles, and anti-symmetric in the case of Fermions. These two physical con- 
straints collapse the tensor product structure of the Hilbert space onto symmetric and 
anti-symmetric components and thus demand for a reformulation of the notions of 
separability and entanglement that, as seen in the bipartite case, are based exactly 
upon the tensorization of Hilbert spaces and of algebras of operators. 

In order to be as concrete as possible, we shall consider a physical scenario
consisting of $N$ Bosonic ultra-cold atoms trapped in a double-well potential. As seen
in Example 5.4.4, the most appropriate description of identical Bosons is by means
of the second quantization formalism and of the Fock representation: we shall denote
by $|vac\rangle$ the vacuum state and by $a$, $a^\dagger$ and $b$, $b^\dagger$ the annihilation and creation
operators of a particle in the left, respectively right well, with $[a, a^\dagger] = [b, b^\dagger] = 1$,
$[a^\#, b^\#] = 0$, where $\#$ stands for either the operator or its adjoint.
Since the total number of Bosons is fixed, the Fock number states

\[ a^\dagger a\,|k, N-k\rangle = k\,|k, N-k\rangle , \qquad b^\dagger b\,|k, N-k\rangle = (N-k)\,|k, N-k\rangle \]

form an orthonormal basis in the Hilbert space $\mathbb{C}^{N+1}$ of the double-well. They are
constructed by acting on the vacuum,

\[ |k, N-k\rangle = \frac{(a^\dagger)^k\,(b^\dagger)^{N-k}}{\sqrt{k!\,(N-k)!}}\ |vac\rangle , \tag{6.106} \]

and represent the physical situation where $k$ Bosons are located in the left well and
$N - k$ in the right one.

In the second quantization formalism there is no pre-defined algebraic tensor
product structure as for distinguishable particles; in the latter case, one starts out
with the tensor product of the algebras of operators acting on the Hilbert spaces of
the single particles: for instance, in the case of one qubit the operator algebra is the
$2 \times 2$ complex matrix algebra $M_2$ and in the case of two distinguishable qubits, it is
the $4 \times 4$ matrix algebra $M_2 \otimes M_2$.


Example 6.4.2 Consider the case of two identical qubits: if they are of Bosonic
type their sector is the symmetric one, $\mathbb{C}^3$ in $\mathbb{C}^4 = \mathbb{C}^2 \otimes \mathbb{C}^2$, otherwise $\mathbb{C}$ if they
behave as Fermions. Using the standard basis, the symmetric sub-space is gener-
ated by $|00\rangle$, $|11\rangle$ and $|\psi_{symm}\rangle = \frac{|01\rangle + |10\rangle}{\sqrt{2}}$, while the anti-symmetric one by
$|\psi_{asymm}\rangle = \frac{|01\rangle - |10\rangle}{\sqrt{2}}$. It follows that mixed states cannot be obtained by sym-
metrization or anti-symmetrization of density matrices, but only by convex combina-
tions of projectors onto symmetrized or anti-symmetrized state vectors. For instance,
a two Bosonic qubit state of the form $\rho \otimes \rho$ is certainly symmetric under exchange of
the two qubits. However, in order to be a Bosonic mixed state, it cannot be supported
by the anti-symmetric component of $\mathbb{C}^4$; namely,

\[ \langle \psi_{asymm}|\,\rho \otimes \rho\,|\psi_{asymm}\rangle = \rho_{00}\,\rho_{11} - |\rho_{01}|^2 = 0 , \]

where $\rho_{ij} = \langle i|\,\rho\,|j\rangle$, $i, j = 0, 1$. This is only possible when $\rho$ is a pure state with
vanishing determinant, never for generic mixed states within the Bloch sphere.
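The weight of $\rho\otimes\rho$ on the anti-symmetric vector can be evaluated explicitly for a few single-qubit states. The following sketch, whose function name and sample matrices are hypothetical, builds the matrix elements of $\rho\otimes\rho$ in the standard basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ and contracts them with the normalized vector $(|01\rangle-|10\rangle)/\sqrt{2}$; the result coincides with the determinant of $\rho$.

```python
def antisymmetric_weight(rho):
    """<psi_asymm| rho (x) rho |psi_asymm> for a 2x2 density matrix rho."""
    # Amplitudes of (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
    s = 2 ** -0.5
    psi = [0.0, s, -s, 0.0]
    total = 0j
    for row in range(4):
        for col in range(4):
            i1, i2 = divmod(row, 2)
            j1, j2 = divmod(col, 2)
            element = rho[i1][j1] * rho[i2][j2]  # matrix element of rho (x) rho
            total += psi[row].conjugate() * element * psi[col]
    return total.real

pure_plus = [[0.5, 0.5], [0.5, 0.5]]                 # pure state |+><+|
maximally_mixed = [[0.5, 0.0], [0.0, 0.5]]           # centre of the Bloch sphere
generic = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]     # a mixed state, det = 0.16

for rho in (pure_plus, maximally_mixed, generic):
    print(antisymmetric_weight(rho))
```

Only the pure state gives a vanishing weight, in agreement with the statement that $\rho\otimes\rho$ is a Bosonic state only when $\det\rho = 0$.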


In absence of a natural tensor product structure, either of the Hilbert space or of 
the algebra of their observables, new approaches to locality (of observables) and 
separability (of states) must be developed (see the review [46] for a comparison 
of various possibilities). The approach addressed in the following is called mode- 
entanglement: it makes use of the formalism of second quantization and is based on 
the observation that the main property of local observables $A \otimes \mathbf{1}$ and $\mathbf{1} \otimes B$ for a
bi-partite system consisting of distinguishable particles is that they commute.

In the case of two Bosons, the tensor product structure can thus be replaced by
pairs of commuting sub-algebras $(\mathcal{A}, \mathcal{B})$ generated by $\{a, a^\dagger\}$, respectively $\{b, b^\dagger\}$,
whereby operators of the form $AB$ with $A \in \mathcal{A}$ and $B \in \mathcal{B}$ can be termed local
with respect to the pair $(\mathcal{A}, \mathcal{B})$, or $(\mathcal{A}, \mathcal{B})$-local. Then, the notions of locality of
observables and (bipartite) separability of states extend as follows.


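Definition 6.4.2 Given two commuting sub-algebras $\mathcal{A}$ and $\mathcal{B}$ of the algebra of
operators $\mathcal{M}$ describing a system of identical particles, an operator $M \in \mathcal{M}$ is said
to be $(\mathcal{A}, \mathcal{B})$-local if of the form $M = AB$ with $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Furthermore, a
state $\rho$ on the Bose algebra $\mathcal{M}$ is separable with respect to the pair $(\mathcal{A}, \mathcal{B})$, or $(\mathcal{A}, \mathcal{B})$-
separable, if, on $(\mathcal{A}, \mathcal{B})$-local observables, it splits into a convex combination of
products of expectations with respect to other states, namely if

\[ \mathrm{Tr}(\rho\,AB) = \sum_i \lambda_i\ \mathrm{Tr}\big(\rho_i^{(1)} A\big)\,\mathrm{Tr}\big(\rho_i^{(2)} B\big) , \qquad \lambda_i \ge 0 , \quad \sum_i \lambda_i = 1 , \tag{6.107} \]

for all $A \in \mathcal{A}$ and $B \in \mathcal{B}$.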


The simplest examples of $(\mathcal{A}, \mathcal{B})$-separable states are the Fock number states
in (6.106). Indeed, any generic operator $A$, respectively $B$, can be approximated,
with respect to a suitable operator-topology, by polynomials $P_A(a, a^\dagger)$, respectively
$P_B(b, b^\dagger)$, in the annihilation and creation operators $a, a^\dagger$, respectively $b, b^\dagger$. Then,
using the commutation relations of $a, a^\dagger$ and $b, b^\dagger$ one can reduce the evaluation of
the correlation functions $\langle k, N-k|\,P_A(a, a^\dagger)\,P_B(b, b^\dagger)\,|k, N-k\rangle$ to that of sums
of the form

\[ \sum_{n_a, m_a}\ \sum_{n_b, m_b} \mu^A_{n_a, m_a}\,\mu^B_{n_b, m_b}\
   \langle vac|\,a^{n_a+k}\,(a^\dagger)^{m_a+k}\ b^{n_b+N-k}\,(b^\dagger)^{m_b+N-k}\,|vac\rangle . \]

Taking into account that the vacuum state is a Gibbs state at temperature $T = 0$, from
(5.195) in Example 5.6.2.3, it follows that the above double sum factorizes into the
product $\sum_{n_a} \mu^A_{n_a, n_a}\, n_a!\ \sum_{n_b} \mu^B_{n_b, n_b}\, n_b!$. On the other hand,

\[ \sum_{n_a} \mu^A_{n_a, n_a}\, n_a! = \sum_{n_a, m_a} \mu^A_{n_a, m_a}\ \langle vac|\,a^{n_a}\,(a^\dagger)^{m_a}\,|vac\rangle \]

and analogously for $\sum_{n_b} \mu^B_{n_b, n_b}\, n_b!$. Consequently, the expectation of any $(\mathcal{A}, \mathcal{B})$-
local observable $AB$, $A \in \mathcal{A}$, $B \in \mathcal{B}$, factorizes,

\[ \langle k, N-k|\,AB\,|k, N-k\rangle = \langle k, N-k|\,A\,|k, N-k\rangle\ \langle k, N-k|\,B\,|k, N-k\rangle , \tag{6.108} \]

and shows no correlations among these commuting observables.

The definitions of locality and separability given above reduce to the standard ones 
in the case of two distinguishable particles. For them, one identifies the sub-algebras 
A and B with the single particle algebras of operators, M with their tensor product. 
Remark however that, while in the case of distinguishable particles the tensor product 
structure is somewhat taken for granted and one does not specify that locality and 
separability always refer to it, it is not so in the case of identical Bosons. Consider
in fact, a Bogolubov transformation, describing the action of a beam splitter either
in quantum optics (see Sect. 5.5.1) or in cold atom interferometry:

\[ c = \frac{a - b}{\sqrt{2}} , \qquad d = \frac{a + b}{\sqrt{2}} . \tag{6.109} \]

It transforms the mode operators $\{a, a^\dagger\}$, $\{b, b^\dagger\}$ into new mode operators $\{c, c^\dagger\}$,
$\{d, d^\dagger\}$ such that $(\mathcal{A}, \mathcal{B})$-local observables are no longer $(\mathcal{C}, \mathcal{D})$-local and $(\mathcal{A}, \mathcal{B})$-
separable states are no longer $(\mathcal{C}, \mathcal{D})$-separable, where $\mathcal{C}$ and $\mathcal{D}$ are the commuting
algebras generated by $c, c^\dagger$ and $d, d^\dagger$.


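Example 6.4.3 Let $M := c^\dagger d$ be a $(\mathcal{C}, \mathcal{D})$-local operator, where $\mathcal{C}$ and $\mathcal{D}$ are
obtained from the commuting two mode sub-algebras constructed with $a, a^\dagger$ and
$b, b^\dagger$ by means of the Bogolubov transformation (6.109). It cannot be $(\mathcal{A}, \mathcal{B})$-local
according to Definition 6.4.2 since

\[ c^\dagger d = \frac{a^\dagger a - b^\dagger b + a^\dagger b - a\,b^\dagger}{2} \]

is a sum of $(\mathcal{A}, \mathcal{B})$-local operators. Furthermore, since the $(\mathcal{A}, \mathcal{B})$-separable states
$|k, N-k\rangle$ in (6.106) are eigenstates of the number operators $a^\dagger a$ and $b^\dagger b$, one finds

\[ \langle k, N-k|\,c^\dagger d\,|k, N-k\rangle = \frac{2k - N}{2} , \]

whereas $\langle k, N-k|\,c^\dagger\,|k, N-k\rangle = \langle k, N-k|\,d\,|k, N-k\rangle = 0$. Thus, accord-
ing to Definition 6.4.2, the $(\mathcal{A}, \mathcal{B})$-separable states $|k, N-k\rangle$ cannot be $(\mathcal{C}, \mathcal{D})$-
separable.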


6.4.2 Quantum Metrology 


In a typical classical metrological scenario, one deals with a probability density
$p_\theta(x) \ge 0$, $\int_I dx\, p_\theta(x) = 1$, associated with a stochastic variable $X$ with a contin-
uous set of outcomes $x \in I$, continuously depending on an unknown real parameter
$\theta$ in a suitable interval. The issue at stake is the estimation of $\theta$ through a suitable
estimator $\Theta(X)$, based on the statistics of $X$. In the following, we shall assume
$p_\theta(x)$ and $\Theta(x)$ to be smooth and consider unbiased estimators whose mean-value
with respect to $p_\theta(x)$ is $\theta$:

\[ \langle \Theta(X) - \theta \rangle_{p_\theta} = \int_I dx\ p_\theta(x)\,\big(\Theta(x) - \theta\big) = 0 . \tag{6.110} \]
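Taking the partial derivative with respect to $\theta$ of the above equality and using the
normalization of $p_\theta(x)$ yields

\[ \int_I dx\ \partial_\theta p_\theta(x)\,\big(\Theta(x) - \theta\big) = \int_I dx\ p_\theta(x)\ \partial_\theta \log p_\theta(x)\ \big(\Theta(x) - \theta\big) = 1 . \tag{6.111} \]

Then, the Cauchy-Schwarz inequality (5.1) with $\psi(x) = \sqrt{p_\theta(x)}\,\big(\Theta(x) - \theta\big)$ and
$\phi_\theta(x) = \sqrt{p_\theta(x)}\ \partial_\theta \log p_\theta(x)$,

\[ 1 = \Big(\int_I dx\ \psi(x)\,\phi_\theta(x)\Big)^2 \le \int_I dx\ \psi^2(x)\ \int_I dx\ \phi_\theta^2(x) , \]

leads to the so-called Cramér-Rao bound

\[ \Delta^2_\theta \Theta := \int_I dx\ p_\theta(x)\,\big(\Theta(x) - \theta\big)^2 \ge \frac{1}{F(p_\theta)} , \tag{6.112} \]

where

\[ F(p_\theta) = \int_I dx\ p_\theta(x)\,\big(\partial_\theta \log p_\theta(x)\big)^2 = \int_I dx\ \frac{\big(\partial_\theta p_\theta(x)\big)^2}{p_\theta(x)} \tag{6.113} \]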

is the so-called Fisher Information. The Cramér-Rao inequality is an estimator inde- 
pendent lower bound to the mean square error of any unbiased estimator; therefore, 
the smaller the Fisher information, the larger the error associated with any given 
estimation protocol. 
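As a concrete instance of the Cramér-Rao bound, consider a Gaussian density with unknown mean $\theta$ and known standard deviation $\sigma$, for which the Fisher information equals $1/\sigma^2$ and the estimator $\Theta(x) = x$ is unbiased. The sketch below is only an illustration under these assumptions: it evaluates the second expression in (6.113) by numerical integration, with the derivative in $\theta$ approximated by central differences; the function names, integration window and step sizes are hypothetical choices.

```python
import math

def gaussian(x, theta, sigma):
    return math.exp(-0.5 * ((x - theta) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h

def fisher_information(theta, sigma, eps=1e-5):
    """Integral of (d_theta p)^2 / p, with d_theta p from central differences."""
    def integrand(x):
        p = gaussian(x, theta, sigma)
        dp = (gaussian(x, theta + eps, sigma) - gaussian(x, theta - eps, sigma)) / (2 * eps)
        return dp * dp / p if p > 0 else 0.0
    return integrate(integrand, theta - 10 * sigma, theta + 10 * sigma)

def mean_square_error(theta, sigma):
    """Mean square error of the unbiased estimator Theta(x) = x."""
    return integrate(lambda x: gaussian(x, theta, sigma) * (x - theta) ** 2,
                     theta - 10 * sigma, theta + 10 * sigma)

theta, sigma = 1.0, 0.5
fisher = fisher_information(theta, sigma)
error = mean_square_error(theta, sigma)
print(fisher, error, fisher * error)  # the product should be close to 1
```

In this example the bound is saturated: the mean square error $\sigma^2$ equals the inverse of the Fisher information.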

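In a quantum scenario, a standard way to encode a continuous parameter $\theta$ in a
smooth way by means of a state $\rho \in \mathcal{S}(S)$ of a $d$-level quantum system is to select
an observable $X = X^\dagger \in M_d(\mathbb{C})$ and rotate the state unitarily:

\[ \rho \mapsto \rho_\theta = e^{-i\theta X}\,\rho\,e^{i\theta X} . \tag{6.114} \]

In the following we shall consider the case of states with non-degenerate non-
vanishing eigenvalues, for the sake of simplicity.

As we have seen in Sect. 5.6.2, these states provide the statistics relative to
any POVM $\mathcal{E} = \{E_j\}_{j\in J}$. Consider a continuous POVM $\mathcal{E} = \{E_x\}_{x\in\mathcal{X}}$ such that
$\int_{\mathcal{X}} dx\, E_x^\dagger E_x = \mathbf{1}$; accordingly, there remains defined a probability distribution

\[ p_\theta^{\mathcal{E}}(x) := \mathrm{Tr}\big(\rho_\theta\, E_x^\dagger E_x\big) , \tag{6.115} \]

with associated Fisher information $F(p_\theta^{\mathcal{E}})$ given by (6.113). The derivative in the
rightmost equality can be written as

\[ \partial_\theta p_\theta^{\mathcal{E}}(x) = \mathrm{Tr}\big(\partial_\theta\rho_\theta\ \Pi_x\big) = \mathrm{Re}\big(\mathrm{Tr}(L_\theta\,\rho_\theta\,\Pi_x)\big) \tag{6.116} \]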
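where $\Pi_x := E_x^\dagger E_x$ and $L_\theta$ is the so-called symmetric logarithmic derivative defined
by

\[ \partial_\theta \rho_\theta =: \frac{L_\theta\,\rho_\theta + \rho_\theta\,L_\theta}{2} . \tag{6.117} \]

In a commutative setting where $[L_\theta, \rho_\theta] = 0$, the above expression has $L_\theta$ reduced
to the standard logarithmic derivative. In full generality, using the spectralization
$\rho_\theta = \sum_j r_j(\theta)\,|r_j(\theta)\rangle\langle r_j(\theta)|$,

\[ L_\theta = 2\int_0^{+\infty} dt\ e^{-t\rho_\theta}\ \partial_\theta\rho_\theta\ e^{-t\rho_\theta} \tag{6.118} \]
\[ \phantom{L_\theta} = 2 \sum_{r_j(\theta) + r_k(\theta) > 0} \frac{\langle r_j(\theta)|\,\partial_\theta\rho_\theta\,|r_k(\theta)\rangle}{r_j(\theta) + r_k(\theta)}\ |r_j(\theta)\rangle\langle r_k(\theta)| , \tag{6.119} \]

as can be seen by inserting (6.118) into (6.117) and observing that $\partial_\theta\rho_\theta$ only involves
indices relative to non-vanishing eigenvalues.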
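By using the positivity of $\rho_\theta$ and $\Pi_x$, we rewrite

\[ \mathrm{Tr}(L_\theta\,\rho_\theta\,\Pi_x) = \mathrm{Tr}\Big(\sqrt{\Pi_x}\,L_\theta\,\sqrt{\rho_\theta}\ \sqrt{\rho_\theta}\,\sqrt{\Pi_x}\Big) . \]

Then, the Cauchy-Schwarz inequality for square matrices (5.29), the second equality
in (6.113) and the fact that $\int_{\mathcal{X}} dx\, \Pi_x = \mathbf{1}$ yield the estimates

\[ \Big(\mathrm{Re}\big(\mathrm{Tr}(L_\theta\,\rho_\theta\,\Pi_x)\big)\Big)^2 \le \mathrm{Tr}(\Pi_x\,L_\theta\,\rho_\theta\,L_\theta)\ \mathrm{Tr}(\Pi_x\,\rho_\theta) , \tag{6.120} \]

\[ F(p_\theta^{\mathcal{E}}) \le \mathrm{Tr}\big(\rho_\theta\,L_\theta^2\big) . \tag{6.121} \]

One thus obtains the so-called Quantum Cramér-Rao bound

\[ \Delta^2_\theta \Theta \ge \frac{1}{F^Q(\rho_\theta)} , \tag{6.122} \]

where $F^Q(\rho_\theta)$ is the so-called Quantum Fisher Information, which, using (6.112),
(6.117) and (6.119), reads

\[ F^Q(\rho_\theta) := \mathrm{Tr}\big(\rho_\theta\,L_\theta^2\big) = \mathrm{Tr}\big(L_\theta\ \partial_\theta\rho_\theta\big) \tag{6.123} \]
\[ \phantom{F^Q(\rho_\theta)} = 2 \sum_{r_j(\theta) + r_k(\theta) > 0} \frac{\big|\langle r_j(\theta)|\,\partial_\theta\rho_\theta\,|r_k(\theta)\rangle\big|^2}{r_j(\theta) + r_k(\theta)} . \tag{6.124} \]

The inverse of the Quantum Fisher Information provides a lower bound to the mean
square error relative to any unbiased estimator $\Theta$ of the parameter $\theta$ encoded into
the quantum states $\rho_\theta$ by means of continuous POVMs $\mathcal{E} = \{E_x\}_{x\in\mathcal{X}}$.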
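In the case of the unitary encoding (6.114), the eigenvalues of $\rho_\theta$ do not depend
on $\theta$ and the Quantum Fisher Information is bounded by the variance of the generator
$X$ of the transformation.


Proposition 6.4.1 If $\rho_\theta = \exp(-i\theta X)\,\rho_0\,\exp(i\theta X)$ with $X = X^\dagger$, then [233]

\[ F^Q(\rho_\theta) \le 4\,\Delta^2_{\rho_\theta} X , \qquad \Delta^2_{\rho_\theta} X := \mathrm{Tr}(\rho_\theta X^2) - \big(\mathrm{Tr}(\rho_\theta X)\big)^2 . \tag{6.125} \]


Proof Since $\partial_\theta\rho_\theta = -i\,[X, \rho_\theta]$, (6.124) yields

\[ F^Q(\rho_\theta) = 2 \sum_{\substack{j,k=1 \\ r_j + r_k > 0}}^{d} \frac{(r_j - r_k)^2}{r_j + r_k}\ \big|\langle r_j(\theta)|\,X\,|r_k(\theta)\rangle\big|^2 . \tag{6.126} \]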

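On the other hand, by means of the eigenvalues of $\rho_\theta$ and of its eigenvectors, that, unlike
the eigenvalues, do depend on $\theta$, one can lower-bound the variance as follows:

\[ \Delta^2_{\rho_\theta} X = \sum_{j,k=1}^d r_j\,\big|\langle r_j(\theta)|X|r_k(\theta)\rangle\big|^2 - \Big(\sum_{j=1}^d r_j\,\langle r_j(\theta)|X|r_j(\theta)\rangle\Big)^2 \]
\[ = \sum_{j,k=1}^d \frac{r_j + r_k}{2}\,\big|\langle r_j(\theta)|X|r_k(\theta)\rangle\big|^2 - \Big(\sum_{j=1}^d r_j\,\langle r_j(\theta)|X|r_j(\theta)\rangle\Big)^2 \]
\[ = \sum_{j\neq k=1}^d \frac{r_j + r_k}{2}\,\big|\langle r_j(\theta)|X|r_k(\theta)\rangle\big|^2 + \sum_{j=1}^d r_j\,\langle r_j(\theta)|X|r_j(\theta)\rangle^2
   - \Big(\sum_{j=1}^d r_j\,\langle r_j(\theta)|X|r_j(\theta)\rangle\Big)^2 , \]

where the second equality follows from $|\langle r_j(\theta)|X|r_k(\theta)\rangle| = |\langle r_k(\theta)|X|r_j(\theta)\rangle|$,
for $X$ is self-adjoint. Applying the Cauchy-Schwarz inequality to the last term,
$\sum_{j=1}^d r_j = 1$ yields

\[ \Big(\sum_{j=1}^d \sqrt{r_j}\,\sqrt{r_j}\ \langle r_j(\theta)|X|r_j(\theta)\rangle\Big)^2 \le \sum_{j=1}^d r_j\,\langle r_j(\theta)|X|r_j(\theta)\rangle^2 , \]

so that

\[ \Delta^2_{\rho_\theta} X \ge \sum_{j\neq k=1}^d \frac{r_j + r_k}{2}\,\big|\langle r_j(\theta)|X|r_k(\theta)\rangle\big|^2 . \]

The result then follows from the fact that

\[ \frac{(r_j - r_k)^2}{r_j + r_k} \le r_j + r_k . \]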

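For state vectors, the Quantum Fisher Information for unitary encodings coincides
with 4 times the variance of the generator of the transformation. Indeed, if $\rho_\theta =
|\psi_\theta\rangle\langle\psi_\theta|$, choosing $r_1 = 1$, $r_j = 0$ for $j > 1$, and $|r_1(\theta)\rangle = |\psi_\theta\rangle$, (6.126) becomes

\[ F^Q(\psi_\theta) = 4 \sum_{j>1} \big|\langle r_j(\theta)|X|\psi_\theta\rangle\big|^2 = 4 \sum_{j>1} \langle \psi_\theta|X|r_j(\theta)\rangle\,\langle r_j(\theta)|X|\psi_\theta\rangle \]
\[ \phantom{F^Q(\psi_\theta)} = 4\,\langle \psi_\theta|X^2|\psi_\theta\rangle - 4\,\big(\langle \psi_\theta|X|\psi_\theta\rangle\big)^2 = 4\,\Delta^2_{\psi_\theta} X . \]

This result can now be used to highlight how entanglement can be used to increase the
Quantum Fisher Information and then the accuracy with which parameters encoded
into quantum states can be measured.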

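Let us consider a quantum system consisting of $N$ qubits that are prepared in the
following two states in $(\mathbb{C}^2)^{\otimes N}$:

\[ |+\rangle^{\otimes N} , \qquad |\Psi_N\rangle = \frac{|0\rangle^{\otimes N} + |1\rangle^{\otimes N}}{\sqrt{2}} , \tag{6.127} \]

where, using the standard basis $\sigma_3|0\rangle = |0\rangle$, $\sigma_3|1\rangle = -|1\rangle$,

\[ |\pm\rangle = \frac{|0\rangle \pm |1\rangle}{\sqrt{2}} , \qquad |j\rangle^{\otimes N} = \underbrace{|j\rangle \otimes |j\rangle \otimes \cdots \otimes |j\rangle}_{N\ \text{times}} , \quad j = 0, 1, \pm . \]

While $|+\rangle^{\otimes N}$ is completely separable with respect to any pair of qubits, $|\Psi_N\rangle$ is
instead fully entangled, being a generalization of the GHZ state discussed in Exam-
ple 5.5.10.3.

With $\rho$ the projection onto one of these two states, let an angle $\theta \in [0, 2\pi)$ be
encoded into

\[ \rho_\theta = \exp\Big(-\frac{i}{\hbar}\,\theta\,J_3^{(N)}\Big)\ \rho\ \exp\Big(\frac{i}{\hbar}\,\theta\,J_3^{(N)}\Big) , \qquad J_3^{(N)} := \frac{\hbar}{2}\sum_{j=1}^N \sigma_3^{(j)} , \tag{6.128} \]

by a global rotation around the $z$ axis. Notice that

\[ J_3^{(N)}\,|+\rangle^{\otimes N} = \frac{\hbar}{2}\sum_{j=1}^N |+\rangle \otimes \cdots \otimes \underbrace{|-\rangle}_{j\text{-th}} \otimes \cdots \otimes |+\rangle , \tag{6.129} \]

\[ J_3^{(N)}\,|\Psi_N\rangle = \frac{N\hbar}{2}\ \frac{|0\rangle^{\otimes N} - |1\rangle^{\otimes N}}{\sqrt{2}} . \tag{6.130} \]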
Then, the global angular momentum has vanishing mean-value in both cases, so that
the variances are given by

\[ \Delta^2_{|+\rangle^{\otimes N}}\, J_3^{(N)} = \big\| J_3^{(N)}\,|+\rangle^{\otimes N} \big\|^2 = \frac{\hbar^2}{4}\,N , \tag{6.131} \]

\[ \Delta^2_{|\Psi_N\rangle}\, J_3^{(N)} = \big\| J_3^{(N)}\,|\Psi_N\rangle \big\|^2 = \frac{\hbar^2}{4}\,N^2 . \tag{6.132} \]

Thus the Quantum Fisher Information increases as $N$ in the separable case and as
$N^2$ in the fully entangled one; accordingly, the mean square error in the estimation of
the angle $\theta$ is bounded from below by $1/N$ in the first case, the so-called shot-noise
limit, and by $N^{-2}$, the so-called Heisenberg limit, in the second one.
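The two scalings can be reproduced by a direct computation, since the collective spin along $z$ is diagonal in the computational basis. The following sketch is an illustration under the assumption $\hbar = 1$, so that the generator of the encoding is the collective spin itself and the Quantum Fisher Information of a pure state is four times its variance; the representation of states as dictionaries of amplitudes and all function names are hypothetical.

```python
from itertools import product

def jz_variance(amplitudes, n_qubits, hbar=1.0):
    """Variance of J = (hbar/2) sum_j sigma_3^(j) in a state given by its
    amplitudes in the computational basis (keys are tuples of 0/1)."""
    mean = mean_sq = 0.0
    for bits, amp in amplitudes.items():
        # sigma_3 |0> = |0>, sigma_3 |1> = -|1>, so J is diagonal in this basis.
        eigenvalue = 0.5 * hbar * (n_qubits - 2 * sum(bits))
        weight = abs(amp) ** 2
        mean += weight * eigenvalue
        mean_sq += weight * eigenvalue ** 2
    return mean_sq - mean ** 2

def product_plus_state(n):
    """|+> tensored n times: uniform amplitudes over all bit strings."""
    amp = 2 ** (-n / 2)
    return {bits: amp for bits in product((0, 1), repeat=n)}

def ghz_state(n):
    """(|0...0> + |1...1>)/sqrt(2)."""
    amp = 2 ** -0.5
    return {(0,) * n: amp, (1,) * n: amp}

for n in (2, 4, 8):
    qfi_separable = 4 * jz_variance(product_plus_state(n), n)
    qfi_entangled = 4 * jz_variance(ghz_state(n), n)
    print(n, round(qfi_separable, 10), round(qfi_entangled, 10))
```

The first column of results grows linearly with the number of qubits, the second quadratically.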


6.4.3 Identical Particles 


Though the importance of entanglement for enhancing metrological accuracy in
quantum contexts is apparent from what precedes, entanglement is by
no means necessary. Several ways are by now known in which the shot-noise limit can be
surpassed without it and we refer to [83] for a comprehensive review of this issue. Here,
we shall focus upon the use of indistinguishable particles.

As we have seen earlier in the Section, for identical particles, the notion of mode-
entanglement needs to be introduced, as entanglement depends on the modes selected for
the description of the system. To highlight the consequence of this fact for quantum
metrology, consider again a system of $N$ ultra-cold Bosons in a double-well potential
with $\mathcal{A}$, respectively $\mathcal{B}$, the sub-algebra describing Bosons in the left, respectively
right well, in an $(\mathcal{A}, \mathcal{B})$-separable Fock state with $k$ Bosons in the left well and $N - k$
Bosons in the right one.

We connect this physical setting with a quantum metrological scenario by means
of the so-called Schwinger representation that associates to the two-mode Bosons
the angular-momentum-like operators

\[ J_x = \frac{a^\dagger b + a\,b^\dagger}{2} , \qquad J_y = \frac{a^\dagger b - a\,b^\dagger}{2i} , \qquad J_z = \frac{a^\dagger a - b^\dagger b}{2} , \tag{6.133} \]

such that $[J_x, J_y] = i\,J_z$. Observe that according to Definition 6.4.2, $J_{x,y,z}$ are
not $(\mathcal{A}, \mathcal{B})$-local. Of the unitary operators they generate, $\exp(i\theta J_z)$ is nevertheless
$(\mathcal{A}, \mathcal{B})$-local; indeed,

\[ \exp(i\theta J_z) = \exp\Big(i\,\frac{\theta}{2}\,a^\dagger a\Big)\ \exp\Big(-i\,\frac{\theta}{2}\,b^\dagger b\Big) . \]

On the contrary $\exp(i\theta J_y)$ and $U_\theta = \exp(i\theta J_x)$ are not $(\mathcal{A}, \mathcal{B})$-local according to
Definition 6.4.2, for they cannot be split into the product of some $A \in \mathcal{A}$ and $B \in \mathcal{B}$.
Indeed, while the action of $\exp(i\theta J_z)$ is such that

\[ e^{i\theta J_z}\,|k, N-k\rangle = e^{i\theta\frac{2k-N}{2}}\,|k, N-k\rangle , \]

which thus remains $(\mathcal{A}, \mathcal{B})$-separable, the action of $U_\theta$ can be used to non-trivially
encode the angle $\theta$ turning it into

\[ |\psi_\theta\rangle = e^{i\theta J_x}\,|k, N-k\rangle , \tag{6.134} \]

which is $(\mathcal{A}, \mathcal{B})$-entangled.
Furthermore, unlike for distinguishable qubits, the Quantum Fisher Information associated
with $|\psi_\theta\rangle$ in (6.134),

\[ F^Q(\psi_\theta) = 4\,\Delta^2_{|k, N-k\rangle}\, J_x = 4\,\langle k, N-k|\,J_x^2\,|k, N-k\rangle
   = N(2k + 1) - 2k^2 , \tag{6.135} \]

exceeds $N$ for all $k \neq 0$ and $k \neq N$. Therefore, except when all identical Bosons are
in one mode and none in the other, despite the $(\mathcal{A}, \mathcal{B})$-separability of the initial Fock-
number state, one can beat the shot-noise limit. It is in fact the rotation generated by
$J_x$, which is not $(\mathcal{A}, \mathcal{B})$-local, that provides the necessary non-locality that allows
to overcome the shot-noise limit.
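The value of the Quantum Fisher Information for Fock number states can be cross-checked by representing $J_x$ as a matrix on the $(N+1)$-dimensional space spanned by the Fock states. The sketch below is a minimal illustration with hypothetical function names; it uses $a^\dagger b\,|k, N-k\rangle = \sqrt{(k+1)(N-k)}\,|k+1, N-k-1\rangle$, which follows from (6.106), and the fact that $J_x$ has vanishing mean value in Fock states.

```python
import math

def jx_matrix(n_bosons):
    """Matrix of J_x = (a^dag b + a b^dag)/2 in the basis |k, N-k>, k = 0..N."""
    dim = n_bosons + 1
    jx = [[0.0] * dim for _ in range(dim)]
    for k in range(n_bosons):
        # a^dag b |k, N-k> = sqrt((k+1)(N-k)) |k+1, N-k-1>
        element = 0.5 * math.sqrt((k + 1) * (n_bosons - k))
        jx[k + 1][k] = element
        jx[k][k + 1] = element
    return jx

def qfi_fock_state(n_bosons, k):
    """Four times the variance of J_x in the Fock state |k, N-k>."""
    jx = jx_matrix(n_bosons)
    # <k| J_x^2 |k> is the squared norm of the k-th column of the real symmetric J_x.
    second_moment = sum(jx[row][k] ** 2 for row in range(n_bosons + 1))
    mean = jx[k][k]
    return 4 * (second_moment - mean ** 2)

n_bosons = 10
for k in range(n_bosons + 1):
    print(k, qfi_fock_state(n_bosons, k), n_bosons * (2 * k + 1) - 2 * k ** 2)
```

The largest value is attained for the balanced state $k = N/2$, where the Quantum Fisher Information equals $N + N^2/2$.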
Quantum Mechanics of Infinite
Degrees of Freedom


Quantum systems with infinite degrees of freedom exhibit properties, like relaxation
to equilibrium, phase-transitions and the existence of inequivalent representations of
the CAR and CCR, that can satisfactorily be dealt with by means of the methods
and techniques of algebraic quantum statistical mechanics [80,81,130]. The point of
departure from standard quantum mechanics is that in an infinite dimensional context
one is usually provided with the algebraic properties of the relevant observables, but,
in general, not with an a priori given representation on a Hilbert space; the latter
rather depends on the physical properties of the systems under consideration
[324,342,343].


7.1 Relaxation to Equilibrium 


As discussed in Remark 2.1.3.4, by discretizing chaotic classical systems, properties
like the exponential growth of errors or a constant entropy production can
survive only over times that scale logarithmically with respect to the discretization
parameter. Indeed, beyond this time-scale, due to the finite number of allowed states,
quasi-periodicity and recursion appear. While in classical dynamical systems recursion
can be eliminated by going to the continuum, this is impossible in quantum
mechanics because of the intrinsic discretization of phase-space, due to $\hbar > 0$ and to
the Heisenberg uncertainty relations. However, recursion times can be made longer
and longer by letting the number of degrees of freedom go to infinity [254,353].

If we let $N \to \infty$ in Example 5.6.1.2, the recurrence time diverges and, unlike
for finitely many spins, infinite spin-chains may exhibit relaxation to equilibrium.
Indeed, observe that

$$ f_N(t) := \prod_{i=1}^{(N-1)/2} \cos\frac{t}{2^i} = \prod_{i=1}^{(N-1)/2} \cos\frac{2t}{2^{i+1}} = \frac{\cos\big(2^{-(N-1)/2}\,t\big)}{\cos t}\,\prod_{i=1}^{(N-1)/2} \cos\frac{2t}{2^i} = f_N(2t)\,\frac{\cos\big(2^{-(N-1)/2}\,t\big)}{\cos t} ; $$

then, when $N \to \infty$, $f_N(t)$ tends to a function $f_\infty(t)$ which satisfies $f_\infty(t)\cos t =
f_\infty(2t)$ together with $f_\infty(0) = 1$. By expanding both members of the first equality
in powers of $t$ and comparing equal powers, it turns out that $f_\infty(t) = \frac{\sin t}{t}$; (5.183) thus
becomes
9r j _ 2itBy, Sint 

t)) = he 
PPM) =e z; 


As a consequence, when t — œ, the time-dependent state p?° defined on the infinite 
spin array by the expectations 


ay > pplk) = p99 (oA), 


tends to the state woo such that wo.(o/) = 1 /2 and Woo (o/.) = 0 for all j. 
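
The limit function used in this section can be checked numerically. The sketch below assumes that the finite products are of the form $\prod_i \cos(t/2^i)$ (a reading of the formula above, not a quotation of it) and compares them with $\sin t / t$; it also checks the functional equation $f_\infty(t)\cos t = f_\infty(2t)$. The function names are illustrative.

```python
import math

def f_finite(t, n_factors):
    """Finite product cos(t/2) * cos(t/4) * ... * cos(t/2**n_factors)."""
    prod = 1.0
    for i in range(1, n_factors + 1):
        prod *= math.cos(t / 2 ** i)
    return prod

def f_limit(t):
    """Candidate limit sin(t)/t, extended by continuity to f(0) = 1."""
    return 1.0 if t == 0 else math.sin(t) / t

if __name__ == "__main__":
    t = 1.3
    for n in (2, 5, 10, 40):
        # the finite products approach sin(t)/t as n grows
        print(n, f_finite(t, n), f_limit(t))
    # functional equation f(t) cos(t) = f(2t)
    print(f_limit(t) * math.cos(t), f_limit(2 * t))
```

Since $\sin t / t$ vanishes as $t \to \infty$, the product form makes the decay of the time-dependent expectations plausible without any recurrence.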


7.2 Inequivalent Representations


One of the most relevant aspects of quantum mechanics with infinite degrees of freedom
is the existence of inequivalent irreducible representations of the same algebra; this
fact explains physical phenomena such as symmetry breaking and phase-transitions
[324,325,342,343].

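We let $N \to \infty$ in Example 5.4.2 and set

$$ |\mathrm{vac}\rangle_\uparrow := |0\rangle^{\otimes\infty} , \qquad |\boldsymbol{i}^{(n)}\rangle_\uparrow := \prod_{j=1}^{n} \sigma_-^{i_j}\,|\mathrm{vac}\rangle_\uparrow \tag{7.1} $$
$$ |\mathrm{vac}\rangle_\downarrow := |1\rangle^{\otimes\infty} , \qquad |\boldsymbol{i}^{(n)}\rangle_\downarrow := \prod_{j=1}^{n} \sigma_+^{i_j}\,|\mathrm{vac}\rangle_\downarrow \tag{7.2} $$

where $\boldsymbol{i}^{(n)} = i_1 i_2 \cdots i_n$ with $i_j \in \mathbb{N}$.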

The physical interpretation is straightforward: $|\mathrm{vac}\rangle_{\uparrow,\downarrow}$ represent configurations
consisting of infinitely many spins all pointing up, respectively down, whereas the
vector states $|\boldsymbol{i}^{(n)}\rangle_{\uparrow,\downarrow}$ describe local configurations that are obtained from $|\mathrm{vac}\rangle_{\uparrow,\downarrow}$
by flipping the spins at the sites specified by $i_1, i_2, \ldots, i_n$ by means of the raising and
lowering operators $\sigma_\pm$. By defining the scalar product of infinite tensor products of
vectors as infinite products of scalar products of vectors at single sites [353], one
gets

$$ {}_k\langle \boldsymbol{i}^{(n)} | \boldsymbol{j}^{(m)} \rangle_k = \delta_{nm} \prod_{\ell=1}^{n} \delta_{i_\ell j_\ell} , \quad k = \uparrow, \downarrow , \qquad {}_\uparrow\langle \boldsymbol{i}^{(n)} | \boldsymbol{j}^{(m)} \rangle_\downarrow = 0 . $$

Indeed, in the first scalar product, outside a local region where the spins may be
flipped, there are infinitely many spins all pointing up (down). Therefore, the value
of the scalar product is determined by the spins within the region where they are
flipped: it is 0 unless the flipped spins on both sides of the scalar product match each
other. On the contrary, in the second scalar product there are always (infinitely many)
spins in $|\boldsymbol{j}^{(m)}\rangle_\downarrow$ that are orthogonal to the ones at the corresponding sites in $|\boldsymbol{i}^{(n)}\rangle_\uparrow$.

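It thus follows that the completions of the linear spans of all vectors of the form
$|\boldsymbol{i}^{(n)}\rangle_\uparrow$, respectively $|\boldsymbol{i}^{(n)}\rangle_\downarrow$, give rise to two orthogonal Hilbert spaces $\mathbb{H}_\uparrow$, respectively
$\mathbb{H}_\downarrow$. Furthermore, let $\mathcal{A}$ be the (local) algebra generated by all products of
Pauli matrices as in (5.63); the two Hilbert spaces then provide two irreducible,
inequivalent representations $\pi_{\uparrow,\downarrow}(\mathcal{A})$. In fact, suppose $X \in \mathbb{B}(\mathbb{H}_\uparrow)$ commutes with
$\pi_\uparrow(\mathcal{A})$, then

$$ {}_\uparrow\langle \boldsymbol{i}^{(p)} |\, X \,| \boldsymbol{j}^{(q)} \rangle_\uparrow = {}_\uparrow\langle \mathrm{vac} |\, \prod_{r=1}^{p} \sigma_+^{i_r} \prod_{s=1}^{q} \sigma_-^{j_s}\, X \,| \mathrm{vac} \rangle_\uparrow = {}_\uparrow\langle \mathrm{vac} |\, X \,| \mathrm{vac} \rangle_\uparrow \;\, {}_\uparrow\langle \boldsymbol{i}^{(p)} | \boldsymbol{j}^{(q)} \rangle_\uparrow , $$

for $\prod_{r=1}^{p} \sigma_+^{i_r} \prod_{s=1}^{q} \sigma_-^{j_s}\,|\mathrm{vac}\rangle_\uparrow = 0$ unless raising and lowering operators match each
other. Thus, $X$ acts as ${}_\uparrow\langle \mathrm{vac}|X|\mathrm{vac}\rangle_\uparrow\,\mathbb{1}$ on a dense set of $\mathbb{H}_\uparrow$, whence $X =
{}_\uparrow\langle \mathrm{vac}|X|\mathrm{vac}\rangle_\uparrow\,\mathbb{1}$ and the commutant of $\pi_\uparrow(\mathcal{A})$ is trivial.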
Consider now the magnetization $\boldsymbol{m}(N)$ relative to the first $N$ spins, that is the
operator-valued vector $\boldsymbol{m}(N)$ of components

$$ m_i(N) := \mu \sum_{n=1}^{N} \sigma_i^n , \qquad i = 1, 2, 3 , $$

and the average magnetization $\boldsymbol{m} = (m_1, m_2, m_3)$ given by the formal limits

$$ m_i := \lim_{N\to+\infty} \frac{m_i(N)}{N} . \tag{7.3} $$


Choose N > jn È im, 1 < ių < ji, then (k =f, }) 


jm 
aE | m3 (N) G yk = EUN + in — jm) HUYO i |o GF Ye 
l=in 


jm 
cE |m aN) G Je = uY a oa Ye 


l=in 
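for $\sigma_3|0\rangle = |0\rangle$, $\sigma_3|1\rangle = -|1\rangle$, while $\langle 0|\sigma_{1,2}|0\rangle = \langle 1|\sigma_{1,2}|1\rangle = 0$. Then,

$$ \lim_{N\to+\infty} {}_k\langle \boldsymbol{i}^{(n)} |\, \frac{m_3(N)}{N} \,| \boldsymbol{j}^{(m)} \rangle_k = \pm\mu\;\, {}_k\langle \boldsymbol{i}^{(n)} | \boldsymbol{j}^{(m)} \rangle_k , \qquad \lim_{N\to+\infty} {}_k\langle \boldsymbol{i}^{(n)} |\, \frac{m_{1,2}(N)}{N} \,| \boldsymbol{j}^{(m)} \rangle_k = 0 , $$

with the plus sign for $k = \uparrow$ and the minus sign for $k = \downarrow$.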


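Therefore, the mean magnetization $\boldsymbol{m}$ exists as a weak-limit (see (5.8)), that is with
respect to the weak-operator topology determined by the representations $\pi_{\uparrow,\downarrow}(\mathcal{A})$
on $\mathbb{H}_{\uparrow,\downarrow}$. It thus depends on the representation with respect to which the limit is
calculated and belongs to the bicommutant $\pi_{\uparrow,\downarrow}(\mathcal{A})''$ (see Definition 5.3.1), where
$\boldsymbol{m}^\uparrow = (0, 0, \mu)$, $\boldsymbol{m}^\downarrow = (0, 0, -\mu)$.

If $\pi_{\uparrow,\downarrow}(\mathcal{A})$ were unitarily equivalent, then there would exist a unitary operator $U :
\mathbb{H}_\uparrow \to \mathbb{H}_\downarrow$ such that $U\,\pi_\uparrow(\mathcal{A})\,U^\dagger = \pi_\downarrow(\mathcal{A})$; if so, it could be extended by continuity
to the weak closures $\pi_{\uparrow,\downarrow}(\mathcal{A})''$. Then, its action on the mean magnetization would
give rise to a contradiction:

$$ -\mu = m_3^\downarrow = U\, m_3^\uparrow\, U^\dagger = U \mu\, U^\dagger = \mu . $$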
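The vectors $|\mathrm{vac}\rangle_{\uparrow,\downarrow}$ behave as vacuum vectors for the spin algebra $\mathcal{A}$; indeed,
$|\mathrm{vac}\rangle_{\uparrow,\downarrow}$ and the representations $\pi_{\uparrow,\downarrow}(\mathcal{A})$ are unitarily equivalent to the GNS
representations based on the expectation functionals

$$ \omega_{\uparrow,\downarrow}\Big( \prod_{r=1}^{p} \sigma_+^{i_r} \prod_{s=1}^{q} \sigma_-^{j_s} \Big) := {}_{\uparrow,\downarrow}\langle \mathrm{vac} |\, \prod_{r=1}^{p} \sigma_+^{i_r} \prod_{s=1}^{q} \sigma_-^{j_s} \,| \mathrm{vac} \rangle_{\uparrow,\downarrow} . $$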


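Other interesting representations of $\mathcal{A}$ can be obtained by means of the following
expectation functional

$$ \omega_s\Big( \bigotimes_{\ell=1}^{n} \sigma_{j_\ell}^{i_\ell} \Big) = \prod_{\ell=1}^{n} \mathrm{Tr}\big(\rho\,\sigma_{j_\ell}\big) , \tag{7.4} $$

where $\rho$ is the spin density matrix of Example 5.5.5 and $\sigma_{j_\ell}^{i_\ell}$ is the $j_\ell$-th Pauli matrix at
site $i_\ell$. The corresponding GNS vector $|\Omega_s\rangle$ can be identified with the infinite tensor
product of the vector states resulting from purifying $\rho$:

$$ |\Omega_s\rangle = \bigotimes_n |\sqrt{\rho}\rangle = \bigotimes_{n=1}^{\infty} \Big( \sqrt{\tfrac{1+s}{2}}\; |0\rangle \otimes |0\rangle + \sqrt{\tfrac{1-s}{2}}\; |1\rangle \otimes |1\rangle \Big) . $$

Further, the GNS representation $\pi_s(\mathcal{A})$ can be identified with the infinite tensor
product

$$ \pi_s(\mathcal{A}) = \bigotimes_n \pi_{\sqrt{\rho}}\big(M_2(\mathbb{C})\big) = \bigotimes_n \big( M_2(\mathbb{C}) \otimes \mathbb{1}_2 \big) . $$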
n 
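The von Neumann algebra $\pi_s(\mathcal{A})''$ is not irreducible, but it is a factor since the commutant
is $\pi_s(\mathcal{A})' = \bigotimes_n \big( \mathbb{1}_2 \otimes M_2(\mathbb{C}) \big)$. Since $\frac{1+\sigma_3}{2}|0\rangle = |0\rangle$ and $\frac{1+\sigma_3}{2}|1\rangle = 0$, the projection

$$ P_N := \prod_{i=N}^{2N} \frac{1 + \sigma_3^i}{2} $$

is such that $P_N |\boldsymbol{i}^{(n)}\rangle_\uparrow = 0$ if any of the sites $i_1, i_2, \ldots, i_n$ belongs to $[N, 2N]$,
else $P_N |\boldsymbol{i}^{(n)}\rangle_\uparrow = |\boldsymbol{i}^{(n)}\rangle_\uparrow$.

Given $|\psi\rangle \in \mathbb{H}_\uparrow$ and $\varepsilon > 0$, one can find a vector $|\phi\rangle_\uparrow$ in the subset linearly
spanned by the $|\boldsymbol{i}^{(n)}\rangle_\uparrow$ indexed by the sites within a suitable finite interval $I_\varepsilon$ such that
$\| |\psi\rangle - |\phi\rangle_\uparrow \| \le \varepsilon$; then, by choosing $N$ such that $I_\varepsilon \cap [N, 2N] = \emptyset$, one estimates

$$ \| (P_N - \mathbb{1}) |\psi\rangle \| \le \| (P_N - \mathbb{1}) |\phi\rangle_\uparrow \| + 2\, \| |\psi\rangle - |\phi\rangle_\uparrow \| \le 2\varepsilon . $$

Therefore, $P_N \to \mathbb{1}$ strongly on $\mathbb{H}_\uparrow$. Instead, $\mathrm{Tr}(\rho\,\sigma_3) = s$ yields

$$ \omega_s(P_N) = \prod_{i=N}^{2N} \mathrm{Tr}\Big( \rho\, \frac{1+\sigma_3}{2} \Big) = \Big( \frac{1+s}{2} \Big)^{N+1} \to 0 , \qquad 0 \le s < 1 . $$

Therefore, for $0 \le s < 1$, the GNS representation $\pi_s(\mathcal{A})$ cannot be unitarily equivalent
to $\pi_\uparrow(\mathcal{A})$. If so, there would exist an isometry $U : \mathbb{H}_s \to \mathbb{H}_\uparrow$ such that

$$ 0 = \lim_{N\to+\infty} \langle \Omega_s |\, \pi_s(P_N) \,| \Omega_s \rangle = \lim_{N\to+\infty} \langle \Omega_s |\, U^\dagger \pi_\uparrow(P_N)\, U \,| \Omega_s \rangle = 1 . $$

In fact, $U|\Omega_s\rangle \in \mathbb{H}_\uparrow$ and $P_N$ converges strongly and thus weakly on $\mathbb{H}_\uparrow$.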


7.3 Factor Types 


According to Example 5.6.2, the states $\omega_s$ on the spin algebra $\mathcal{A}$ may be interpreted
as thermal spin states at inverse temperature

$$ \beta = \frac{1}{2} \log\frac{1+s}{1-s} : $$


• the zero temperature state $\omega_1$ is equivalent to the vacuum state $\omega_\uparrow$;

• $\omega_0$ is an infinite temperature state with the properties of a tracial state (compare
(5.57)) such that $\omega_0(XY) = \omega_0(YX)$ for all $X, Y \in \mathcal{A}$;

• for $0 < s < 1$, $\omega_s$ is a thermal state with no specific properties.


Correspondingly, the von Neumann algebras $\pi_s(\mathcal{A})''$ that arise from the strong closures
of the spin algebras $\pi_s(\mathcal{A})$ on the GNS Hilbert spaces $\mathbb{H}_s$ are instances of the
so-called factors of type I, II and III [353].
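
The difference between the tracial state $\omega_0$ and the states $\omega_s$ with $s \neq 0$ already shows up at a single site. The sketch below assumes the single-site density matrix $\rho = \mathrm{diag}\big(\tfrac{1+s}{2}, \tfrac{1-s}{2}\big)$, which is what $\mathrm{Tr}(\rho\,\sigma_3) = s$ suggests for a diagonal $\rho$, and evaluates $\omega_s(XY) - \omega_s(YX)$ on the two off-diagonal matrix units; the names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def omega(s, x):
    """Single-site expectation Tr(rho x) with rho = diag((1+s)/2, (1-s)/2) (assumed form)."""
    rho = [[(1 + s) / 2, 0.0], [0.0, (1 - s) / 2]]
    m = matmul(rho, x)
    return m[0][0] + m[1][1]

E01 = [[0.0, 1.0], [0.0, 0.0]]  # matrix unit |0><1|
E10 = [[0.0, 0.0], [1.0, 0.0]]  # matrix unit |1><0|

def trace_defect(s):
    """omega_s(XY) - omega_s(YX) for X = |0><1|, Y = |1><0|; zero iff the state is tracial on them."""
    return omega(s, matmul(E01, E10)) - omega(s, matmul(E10, E01))

if __name__ == "__main__":
    for s in (0.0, 0.3, 1.0):
        print(s, trace_defect(s))
```

The defect equals $s$, so among these product states only the infinite temperature state $s = 0$ can be tracial.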

The classification of von Neumann algebras starts by considering different possible
classes of their projections. A projection $p = p^\dagger = p^2$ of a von Neumann algebra
$\mathcal{A}$ is called an Abelian projection if $p\,\mathcal{A}\,p$ is Abelian. Typical examples in this class
are the minimal projections of Example 5.3.4.1: if $p \in \mathcal{A}$ is a minimal projection and
$q \in \mathcal{A}$ is another projection, then $0 \le pqp \le p$. Therefore, $pqp = \lambda p$ as all spectral
projections of $pqp$ are $\le p$ and must then be equal to $p$. Since $\mathcal{A}$ is generated by its
projections $q$, $p\,\mathcal{A}\,p$ is Abelian.

Two projectors $p$, $q$ are said to be equivalent, $p \sim q$, if there exists $U \in \mathcal{A}$ such
that $U^\dagger U = p$ and $U U^\dagger = q$. This is the case for the initial and range projections in the
polar decomposition (see Remark 5.2.2). A projection $p \in \mathcal{A}$ is said to be finite if
$\mathcal{A} \ni q = q^\dagger = q^2 \le p$ and $q \sim p$ imply $q = p$. Any Abelian projection $p \in \mathcal{A}$
is finite; in fact, let $q \le p$ and $p = U^\dagger U$, $q = U U^\dagger$. Then, since $q \le p \Rightarrow qp =
pq = q$ (see Example 5.3.4.1), it turns out that $V := pUp = pqU = qU$ and $V^\dagger$
commute so that

$$ V^\dagger V = U^\dagger q\, U = p = V V^\dagger = q\, U U^\dagger q = q . $$


1. A unital von Neumann algebra $\mathcal{A}$ is said to be of type I if its identity $\mathbb{1}$ can
be decomposed into an orthogonal sum of Abelian projections $p_n \in \mathcal{A}$. Typical
examples are $\mathcal{A} := L^\infty_\mu(\mathcal{X})$ and $\mathbb{B}(\mathbb{H})$ with $\mathbb{H}$ a separable Hilbert space. In the first
case, the characteristic functions of the atoms of any partition of $\mathcal{X}$ into disjoint
measurable atoms are the required Abelian projections. In the second case, the
projections $p_n = |n\rangle\langle n|$ onto the orthonormal vectors $\{|n\rangle\}_{n\in\mathbb{N}}$ of any ONB are
Abelian and sum up to the identity. Then, also $\mathcal{A} \otimes \mathbb{B}(\mathbb{H})$ is of type I, the required
Abelian projections being given by $\mathbb{1}_{\mathcal{A}} \otimes p_n$.

2. A unital von Neumann algebra is said to be finite if its identity is a finite projection,
semi-finite if its identity can be decomposed into an orthogonal sum of finite
projections. Since the projections $p_n$ in the previous point are minimal, type I
von Neumann algebras on infinite dimensional Hilbert spaces are semi-finite,
finite if $\dim(\mathbb{H}) = n$.

3. A unital von Neumann algebra is said to be of type II if it is semi-finite, but does
not contain any non-zero Abelian projection; of type III if it does not contain
any non-zero finite projection.


The trace for finite dimensional systems (see (5.19)) is a particular realization of 
the following general notion. 


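Definition 7.3.1 (Traces) [353] A trace on a von Neumann algebra $\mathcal{A}$ is a map
$\Phi : \mathcal{A}_+ \to \mathbb{R}_+$ from its positive elements into the positive reals $\mathbb{R}_+$ such that

$$ \Phi\Big( \sum_i \lambda_i A_i \Big) = \sum_i \lambda_i\, \Phi(A_i) \qquad \forall\, \lambda_i \in \mathbb{R}_+ ,\ A_i \in \mathcal{A}_+ , $$
$$ \Phi(A) = \Phi(U^\dagger A\, U) \qquad \forall\, A \in \mathcal{A}_+ ,\ U \in \mathcal{A} \text{ unitary} . $$

The trace is

• faithful if $\Phi(A) = 0 \Rightarrow A = 0$ for $A \in \mathcal{A}_+$;
• finite if $\Phi(A) < +\infty$ for all $A \in \mathcal{A}_+$;
• semi-finite if for all non-zero $A \in \mathcal{A}_+$ there exists a non-zero $\mathcal{A}_+ \ni B \le A$ with $\Phi(B) < +\infty$;
• normal if $\sup_\alpha \Phi(A_\alpha) = \Phi(\sup_\alpha A_\alpha)$ for every increasing net $\{A_\alpha\} \subset \mathcal{A}_+$.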


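Analogously to what has been proved for states (see Example 5.3.2.3), it turns out
that if $\Phi \le \Psi$ for two faithful, semi-finite traces on $\mathcal{A}$, then there exists $0 \le X' \le \mathbb{1}$
in the center $\mathcal{Z} = \mathcal{A} \cap \mathcal{A}'$ such that $\Phi(A) = \Psi(X' A)$ for all $A \in \mathcal{A}_+$ [353]. Then,
if $\mathcal{A}$ is a factor, $\mathcal{Z} = \{\lambda \mathbb{1}\}$ and all traces on it are proportional; in fact, given any two
traces $\Phi_i$, $i = 1, 2$,

$$ \Phi_1 \le \Phi_1 + \Phi_2 \Rightarrow \Phi_1 = \lambda_1 (\Phi_1 + \Phi_2) , \qquad \Phi_2 \le \Phi_1 + \Phi_2 \Rightarrow \Phi_2 = \lambda_2 (\Phi_1 + \Phi_2) , $$

whence $\Phi_1 = \lambda_1 \lambda_2^{-1}\, \Phi_2$.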

Consequently, any chosen trace on a factor von Neumann algebra can be used
to assign its projections an intrinsic dimension, thus providing a characterization of
types [140,194,353]:


1. Factors of type I have a semi-finite, faithful, normal trace given by (5.19) whose
range on projections is discrete and finite for finite type $I_n$ factors, or countable for
infinite type $I_\infty$ factors: an example of the latter case is the spin algebra $\pi_1(\mathcal{A})''$
with respect to the zero temperature state $\omega_1$;

2. factors of type II have a semi-finite, faithful, normal trace whose range on projections
is the whole interval $[0, 1]$ for finite type $II_1$ factors or the whole of $\mathbb{R}_+$
for infinite type $II_\infty$ factors. An instance of the first occurrence is the spin algebra
$\pi_s(\mathcal{A})''$ with respect to the infinite temperature state $\omega_s$ when $s = 0$;

3. finally, type III factors have no semi-finite, faithful, normal traces: this is the
case of the spin algebra $\pi_s(\mathcal{A})''$ when $0 < s < 1$.


7.4 Observables, States and Dynamics 


The physical scenario in the examples discussed in the previous section is a common
one in quantum statistical mechanics. Indeed, the limit of infinitely many degrees
of freedom is in general achieved in the so-called thermodynamical limit, where
one starts with $N$ particles in a finite volume $V \subset \mathbb{R}^3$ (or $\mathbb{Z}^3$ in the case of a lattice
system) and lets $N, V \to \infty$ in such a way that $N/V \to \rho$, where $\rho > 0$ is a given
spatial density.

Each $V \subset \mathbb{R}^3$ has its own Hilbert space $\mathbb{H}_V = \mathbb{L}^2_{dr}(V)$ of Lebesgue square-summable
functions and the corresponding C* algebra $\mathcal{A}_V = \mathbb{B}(\mathbb{H}_V)$ of bounded
operators. Instead, in the case of a lattice system, each $x \in V$ carries a Hilbert space
$\mathbb{H}_x$ and the C* algebra $\mathcal{A}_x := \mathbb{B}(\mathbb{H}_x)$, so that $\mathbb{H}_V = \bigotimes_{x \in V} \mathbb{H}_x$ and $\mathcal{A}_V = \bigotimes_{x \in V} \mathcal{A}_x$.
Notice that in the continuous case each Hilbert space $\mathbb{H}_V$ is infinite dimensional,
while in the discrete case it depends on whether the Hilbert spaces $\mathbb{H}_x$ at the
lattice sites are finite dimensional or not. If $V_1 \subset V_2$, set $V_1^c := V_2 \setminus V_1$, then
$\mathbb{H}_{V_2} = \mathbb{H}_{V_1} \otimes \mathbb{H}_{V_1^c}$ and $\mathcal{A}_{V_1}$ becomes a subalgebra of $\mathcal{A}_{V_2}$ by embedding any
$A_1 \in \mathcal{A}_{V_1}$ into $\mathcal{A}_{V_2}$ as $A_1 \otimes \mathbb{1}_{V_1^c}$, where $\mathbb{1}_{V_1^c}$ denotes the identity operator on the
Hilbert space $\mathbb{H}_{V_1^c}$. It follows that the set $\mathcal{A}_0 := \bigcup_V \mathcal{A}_V$ is a *-algebra, namely it is
closed under addition, multiplication and adjunction of its elements; also, it is naturally endowed
with the norm $\mathcal{A}_V \ni X \mapsto \|X\|$ for all $V \subset \mathbb{R}^3$.


Definition 7.4.1 (Quasi-Local C* algebras) The normed *-algebra $\mathcal{A}_0$ is the algebra
of local observables, while its norm-closure $\mathcal{A} := \overline{\bigcup_V \mathcal{A}_V}^{\,\|\cdot\|}$ is
known as a quasi-local C* algebra.


Remark 7.4.1 The notion of quasi-local algebra is physically motivated by the fact
that the only experimentally accessible observables of infinitely extended quantum
systems are the local ones. These can then be used to approximate as much as one
desires the non-local ones. From a mathematical point of view, the construction is
an instance of inductive limit [140] of a directed net of C* algebras. In the case of an
increasing sequence $\{\mathcal{A}_{n_j}\}_{j\in\mathbb{N}}$ of finite-dimensional C* algebras $\mathcal{A}_{n_j} \subseteq \mathcal{A}_{n_{j+1}}$, the
generated quasi-local C* algebra is called Approximately Finite (AF); if the algebras $\mathcal{A}_{n_j}$
are full matrix algebras $M_{n_j}(\mathbb{C})$, then $\mathcal{A}$ is known as Uniformly Hyperfinite (UHF)
[289,290,328].

Suppose an increasing sequence of finite-dimensional unital C* algebras $\{\mathcal{A}_{n_j}\}_{j\in\mathbb{N}}$
is represented on a Hilbert space $\mathbb{H}$, then the strong closure of $\bigcup_j \mathcal{A}_{n_j}$ is a von
Neumann algebra $\mathcal{M}$ which is called Hyperfinite. An Abelian instance of such an
algebra is the von Neumann algebra of essentially bounded functions, $L^\infty_\mu(\mathcal{X})$, which
one can generate by means of the characteristic functions of finer and finer finite
partitions of $\mathcal{X}$ as explained in Remark 2.2.3.4.


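Example 7.4.1 [11] Let $M_{n_1}(\mathbb{C}) \subseteq M_{n_2}(\mathbb{C})$ be two matrix algebras; given a system
of matrix units $\{E^{(1)}_{kj}\}_{k,j=1}^{n_1}$ for the smaller one (see (5.12)), the orthogonal projections
$E^{(1)}_{kk}$ sum up to the identity, $\sum_{k=1}^{n_1} E^{(1)}_{kk} = \mathbb{1}_{n_2} \in M_{n_2}(\mathbb{C})$, whence $\sum_{k=1}^{n_1} \mathrm{Tr}_2\big(E^{(1)}_{kk}\big) =
n_2$, where $\mathrm{Tr}_2$ denotes the trace computed with respect to the Hilbert space $\mathbb{C}^{n_2}$. But
then, using the cyclicity of the trace,

$$ \mathrm{Tr}_2\big(E^{(1)}_{kk}\big) = \mathrm{Tr}_2\big(E^{(1)}_{kp} E^{(1)}_{pk}\big) = \mathrm{Tr}_2\big(E^{(1)}_{pk} E^{(1)}_{kp}\big) = \mathrm{Tr}_2\big(E^{(1)}_{pp}\big) , $$

for all $k, p = 1, 2, \ldots, n_1$, whence $n_2 = n_1 \times d$, where $d := \mathrm{Tr}_2\big(E^{(1)}_{kk}\big)$ for all $k =
1, 2, \ldots, n_1$. Let $\{|f_j\rangle \in \mathbb{C}^{n_2}\}_{j=1}^{d}$ be an ONB in the subspace projected out by $E^{(1)}_{11}$
and set

$$ E^{(2)}_{(k_1 k_2);(j_1 j_2)} := E^{(1)}_{k_1 1}\, |f_{k_2}\rangle\langle f_{j_2}|\, E^{(1)}_{1 j_1} . \tag{7.5} $$

Since $1 \le k_1, j_1 \le n_1$ while $1 \le k_2, j_2 \le d$, these are $n_1^2 \times d^2 = n_2^2$ matrices in
$M_{n_2}(\mathbb{C})$; moreover, from (5.12) and $E^{(1)}_{11}|f_p\rangle = |f_p\rangle$ it follows that

$$ E^{(2)}_{(k_1 k_2);(j_1 j_2)}\, E^{(2)}_{(p_1 p_2);(q_1 q_2)} = E^{(1)}_{k_1 1}\, |f_{k_2}\rangle\langle f_{j_2}|\, E^{(1)}_{1 j_1} E^{(1)}_{p_1 1}\, |f_{p_2}\rangle\langle f_{q_2}|\, E^{(1)}_{1 q_1} $$
$$ = \delta_{j_1 p_1}\, E^{(1)}_{k_1 1}\, |f_{k_2}\rangle\langle f_{j_2}|\, E^{(1)}_{11}\, |f_{p_2}\rangle\langle f_{q_2}|\, E^{(1)}_{1 q_1} $$
$$ = \delta_{j_1 p_1} \delta_{j_2 p_2}\, E^{(1)}_{k_1 1}\, |f_{k_2}\rangle\langle f_{q_2}|\, E^{(1)}_{1 q_1} $$
$$ = \delta_{j_1 p_1} \delta_{j_2 p_2}\, E^{(2)}_{(k_1 k_2);(q_1 q_2)} . $$

Thus, (7.5) defines a set of matrix units in $M_{n_2}(\mathbb{C})$. Set $E^{(d)}_{k_2 j_2} := |f_{k_2}\rangle\langle f_{j_2}|$; then,
$E^{(2)}_{(k_1 k_2);(j_1 j_2)}$ can be isomorphically represented by $E^{(1)}_{k_1 j_1} \otimes E^{(d)}_{k_2 j_2}$ on $\mathbb{C}^{n_2} = \mathbb{C}^{n_1} \otimes
\mathbb{C}^{d}$. Therefore $M_{n_2}(\mathbb{C})$ is isomorphic to $M_{n_1}(\mathbb{C}) \otimes M_d(\mathbb{C})$. The matrix algebras
$M_{n_j}(\mathbb{C}) \subseteq M_{n_{j+1}}(\mathbb{C})$ that generate a UHF algebra $\mathcal{A}$ must be such that any $n_j$
divides the subsequent one, so that $\mathcal{A}$ is isomorphic to an infinite tensor product of
matrix algebras. The simplest instance of UHF algebra $\mathcal{A}$ is a quantum spin-chain
(see Sect. 7.4.5). In the case of the previously discussed infinite spin system, $\mathcal{A}$ is the
quasi-local algebra generated by the local algebras $\mathcal{A}_{[-k,k]} = \bigotimes_{\ell=-k}^{k} (M_2(\mathbb{C}))_\ell$,
which are tensor products of $2 \times 2$ matrix algebras at each lattice site.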
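
The multiplication rule for the composite matrix units can be checked on small cases. The following is a minimal sketch, assuming the tensor-product representation of the units as Kronecker products of elementary matrix units on $\mathbb{C}^{n_1} \otimes \mathbb{C}^{d}$; all function names are illustrative.

```python
from itertools import product

def unit(n, k, j):
    """Elementary n x n matrix unit with a single 1 in row k, column j."""
    return [[1 if (r == k and c == j) else 0 for c in range(n)] for r in range(n)]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    nb = len(b)
    n = len(a) * nb
    return [[a[r // nb][c // nb] * b[r % nb][c % nb] for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def composite_unit(n1, d, k1, k2, j1, j2):
    """Composite unit labelled by (k1 k2);(j1 j2), realized as a Kronecker product."""
    return kron(unit(n1, k1, j1), unit(d, k2, j2))

def check_matrix_units(n1, d):
    """Verify E_{K;J} E_{P;Q} = delta_{J,P} E_{K;Q} for all composite labels."""
    n2 = n1 * d
    zero = [[0] * n2 for _ in range(n2)]
    labels = list(product(range(n1), range(d)))
    for (k1, k2), (j1, j2), (p1, p2), (q1, q2) in product(labels, repeat=4):
        lhs = matmul(composite_unit(n1, d, k1, k2, j1, j2),
                     composite_unit(n1, d, p1, p2, q1, q2))
        rhs = composite_unit(n1, d, k1, k2, q1, q2) if (j1, j2) == (p1, p2) else zero
        if lhs != rhs:
            return False
    return True

if __name__ == "__main__":
    print(check_matrix_units(2, 2), check_matrix_units(2, 3))
```

There are $(n_1 d)^2$ such units, which matches the dimension of the full matrix algebra on $\mathbb{C}^{n_1 d}$.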


7.4.1 Bosons and Fermions 


Physical systems of quantum statistical mechanics usually consist of indistinguish- 
able particles and are described by operators of creation and annihilation satisfying 
either the CAR (5.64) or the CCR (5.94). More precisely, one considers the Fock rep- 
resentation built upon the existence of a distinguished vacuum vector | vac ) (which 
was considered in Examples 5.6.2.1, 2 for finitely many degrees of freedom). 

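Let $\mathbb{H}$ be the Hilbert space describing a single Fermion or Boson and let $\{|\psi_i\rangle\}_{i\in\mathbb{N}}$
be an ONB. Then, one introduces operators

$$ a_i := a(\psi_i) \quad \text{such that} \quad a(\psi_i)|\mathrm{vac}\rangle = 0 \quad \forall\, i \in \mathbb{N} , $$
$$ a_i^\dagger := a^\dagger(\psi_i) \quad \text{such that} \quad a_i^\dagger|\mathrm{vac}\rangle = |\psi_i\rangle . $$

They are required to satisfy the CAR (5.64) if the particles are Fermions, the CCR
(5.94) if Bosons.

By expanding any $|\psi\rangle \in \mathbb{H}$ along the chosen ONB, $|\psi\rangle = \sum_i c_i |\psi_i\rangle$, one can
consistently define creation and annihilation operators of generic $|\psi\rangle \in \mathbb{H}$:

$$ a^\dagger(\psi) = \sum_{i\in\mathbb{N}} c_i\, a_i^\dagger , \qquad a(\psi) = \sum_{i\in\mathbb{N}} c_i^*\, a_i . $$

This yields $a(\psi)|\mathrm{vac}\rangle = 0$, $a^\dagger(\psi)|\mathrm{vac}\rangle = |\psi\rangle$ and

$$ [a(\psi), a(\phi)] = [a^\dagger(\psi), a^\dagger(\phi)] = 0 , \qquad [a(\psi), a^\dagger(\phi)] = \langle \psi | \phi \rangle \qquad \text{(CCR)} $$
$$ \{a(\psi), a(\phi)\} = \{a^\dagger(\psi), a^\dagger(\phi)\} = 0 , \qquad \{a(\psi), a^\dagger(\phi)\} = \langle \psi | \phi \rangle \qquad \text{(CAR)} , $$

for all $|\psi\rangle, |\phi\rangle \in \mathbb{H}$. Furthermore, by using these algebraic relations one gets

$$ a(\psi)|\phi\rangle = a(\psi)\, a^\dagger(\phi)|\mathrm{vac}\rangle = \langle \psi | \phi \rangle\, |\mathrm{vac}\rangle . \tag{7.6} $$

For both Fermions and Bosons the number operator is defined by

$$ N := \sum_{i\in\mathbb{N}} a_i^\dagger a_i . $$

Directly for Bosons and by means of

$$ [a_i^\dagger a_i, a^\dagger(f)] = a_i^\dagger \{a_i, a^\dagger(f)\} - \{a_i^\dagger, a^\dagger(f)\}\, a_i = f_i\, a_i^\dagger , \tag{7.7} $$

where $f_i := \langle \psi_i | f \rangle$, for Fermions, one finds that

$$ [N, a^\dagger(f)] = a^\dagger(f) , \qquad [N, a(f)] = -a(f) . \tag{7.8} $$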

The Fock space $\mathbb{H}_{F,B}$ for Fermions, respectively Bosons, is generated by the completion
of the linear span of vectors of the form $P(a(f), a^\dagger(g))|\mathrm{vac}\rangle$ where
$P(a(f), a^\dagger(g))$ is any polynomial in Fermi, respectively Bose, annihilation and creation
operators. The Fermi operators $a^\#(f)$ are bounded on the Fock space; indeed,
from the CAR it follows that, for any normalized $|\Psi\rangle \in \mathbb{H}_F$,

$$ \| a(f) |\Psi\rangle \|^2 + \| a^\dagger(f) |\Psi\rangle \|^2 = \| f \|^2 . $$

The polynomials $P(a(f), a^\dagger(g))$ with $f, g \in \mathbb{H}_V$, where $V$ is a finite volume, generate,
by norm completion, a local C*-algebra, $\mathcal{A}^F_V$.

Remark 7.4.2 Given two volumes $V_1 \subset V_2 \subset \mathbb{R}^3$, the local Fermi algebra $\mathcal{A}^F_{V_2}$ cannot
be isomorphic to $\mathcal{A}^F_{V_1} \otimes \mathcal{A}^F_{V_3}$, where $V_3 := V_2 \setminus V_1$. In fact, if $|f\rangle \in
\mathbb{H}_{V_1}$ and $|g\rangle \in \mathbb{H}_{V_3}$, despite the fact that $\langle f | g \rangle = 0$, commutators of the form
$[a^\#(f), a^\#(g)]$ need not vanish. However, because of (5.65), commutators vanish if
one considers polynomials with even numbers of creation and annihilation operators:
the quasi-local C* algebra they generate is denoted by AČ. The quasi-local algebra 
C* generated by polynomial with a same number of creation and annihilation oper- 
ators is denoted by AF; it is known as even Fermi algebra and commutes with the 
number operator. Indeed, using (7.8), it turns out that the number operator generates
the gauge-transformation

$$ e^{i\alpha N}\, a^\dagger(f)\, e^{-i\alpha N} = \sum_{k=0}^{+\infty} \frac{(i\alpha)^k}{k!}\, [N, [N, \cdots [N, a^\dagger(f)] \cdots ]] = \sum_{k=0}^{+\infty} \frac{(i\alpha)^k}{k!}\, a^\dagger(f) = e^{i\alpha}\, a^\dagger(f) = a^\dagger(e^{i\alpha} f) \tag{7.9} $$
$$ e^{i\alpha N}\, a(f)\, e^{-i\alpha N} = e^{-i\alpha}\, a(f) = a(e^{i\alpha} f) , \tag{7.10} $$

where the $k$-th term of the first sum contains $k$ nested commutators with $N$, each of which reproduces $a^\dagger(f)$ by (7.8).

Therefore, the various phases compensate each other in polynomials with equal
numbers of $a$ and $a^\dagger$; these are thus left invariant by the gauge-transformation for
any $\alpha \in \mathbb{R}$ and must therefore commute with the number operator $N$.

For Bosons, $[a^\#(f_1), a^\#(f_2)] = 0$ if $f_i \in \mathbb{H}_{V_i}$ and $V_1 \cap V_2 = \emptyset$; however, the
operators $a^\#(f)$ cannot be bounded (see (5.71)). In order to construct local C*
algebras generating a Bose quasi-local algebra $\mathcal{A}_B$, one associates to $|\psi\rangle \in \mathbb{H}$ the
bounded operators [130]

$$ W(\psi) = \exp\Big( i\, \frac{a(\psi) + a^\dagger(\psi)}{\sqrt{2}} \Big) \tag{7.11} $$

which generalize the Weyl operators (5.98) and linearly generate a local Bosonic C*
subalgebra $\mathcal{A}^B_V$ by choosing $\psi$ supported within the volume $V$.


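The C* algebras $\mathcal{A}_{B,F}$ are irreducibly represented on the Fock spaces $\mathbb{H}_{B,F}$. In
order to show this one can use a similar argument as for the spin algebras $\pi_{\uparrow,\downarrow}(\mathcal{A})$
discussed in the previous section. If $X$ belongs to the commutant, $X \in \mathcal{A}'_{B,F}$, then
(7.6) yields

$$ \langle \mathrm{vac} |\, a(g_n) \cdots a(g_1)\, X\, a^\dagger(f_1) \cdots a^\dagger(f_m) \,| \mathrm{vac} \rangle = \langle \mathrm{vac} |\, X\, a(g_n) \cdots a(g_1)\, a^\dagger(f_1) \cdots a^\dagger(f_m) \,| \mathrm{vac} \rangle $$
$$ = \langle \mathrm{vac} |\, X \,| \mathrm{vac} \rangle\; \langle \mathrm{vac} |\, a(g_n) \cdots a(g_1)\, a^\dagger(f_1) \cdots a^\dagger(f_m) \,| \mathrm{vac} \rangle , $$

where (see (5.195) and (5.193))

$$ \langle \mathrm{vac} |\, a(g_n) \cdots a(g_1)\, a^\dagger(f_1) \cdots a^\dagger(f_m) \,| \mathrm{vac} \rangle = \delta_{nm} \begin{cases} \mathrm{Per}\big[ \langle g_i | f_j \rangle \big] & \text{Bosons} \\ \mathrm{Det}\big[ \langle g_i | f_j \rangle \big] & \text{Fermions} \end{cases} . \tag{7.12} $$

Therefore, $X = \langle \mathrm{vac} | X | \mathrm{vac} \rangle\, \mathbb{1}$ whence the commutant is trivial; this means (see
Lemma 5.3.2) that $\mathcal{A}_{B,F}$ are irreducibly represented.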
7.4.1.1 Quasi-Free Automorphisms and Quasi-Free States
The gauge-transformation (7.9) is a particularly simple example of quasi-free automorphism.


Definition 7.4.2 Every single particle unitary transformation $U : \mathbb{H} \to \mathbb{H}$ gives rise
to a quasi-free automorphism $\Theta_U$ on $\mathcal{A}_{B,F}$ given by

$$ \Theta_U[a^\#(f)] = a^\#(U f) . \tag{7.13} $$


Quasi-free automorphisms are typical time-evolutions of non-interacting particles
possibly subjected to external potentials. They preserve the number operator; indeed,
by expanding $U|f_i\rangle = \sum_j c_{ij} |f_j\rangle$ with respect to the chosen ONB, it turns out that

$$ \Theta_U[N] = \sum_{i\in\mathbb{N}} a^\dagger(U f_i)\, a(U f_i) = \sum_{i,j,k} c_{ij}\, c_{ik}^*\, a^\dagger(f_j)\, a(f_k) = \sum_{j\in\mathbb{N}} a^\dagger(f_j)\, a(f_j) , $$

for the matrix $C = [c_{ij}]$ is unitary.
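Examples 7.4.2


1. Let $\{U_x\}_{x\in\mathbb{R}^3}$ be the unitary group of space-translations,

$$ |f_x\rangle := U_x |f\rangle , \qquad \langle r | f_x \rangle = f(r - x) , $$

for all $f \in \mathbb{L}^2_{dr}(\mathbb{R}^3)$; then, $\{\Theta_x\}_{x\in\mathbb{R}^3}$ is the group of space-translation automorphisms
of $\mathcal{A}_{B,F}$:

$$ \Theta_x[a^\#(f)] = a^\#(f_x) . \tag{7.14} $$


2. Let $h : \mathbb{H} \to \mathbb{H}$ be a single-particle Hamiltonian with discrete spectrum, $h =
\sum_i \varepsilon_i |\psi_i\rangle\langle\psi_i|$ being its spectral decomposition, and set $a_i^\# := a^\#(\psi_i)$. Therefore,
the basic annihilation and creation operators annihilate and create single-particle
energy eigenvectors. Consider the second-quantized Hamiltonian $H =
\sum_i \varepsilon_i\, a_i^\dagger a_i$ and the generated one-parameter group of automorphisms $\Theta :=
\{\Theta_t\}_{t\in\mathbb{R}}$,

$$ a^\#(f) \mapsto \Theta_t[a^\#(f)] := e^{itH}\, a^\#(f)\, e^{-itH} . $$

By expanding and summing as in Remark 7.4.2 one finds that, for both Bosons
and Fermions,

$$ e^{\alpha H}\, a(f)\, e^{-\alpha H} = a(e^{-\alpha^* h} f) , \qquad e^{\alpha H}\, a^\dagger(f)\, e^{-\alpha H} = a^\dagger(e^{\alpha h} f) , \tag{7.15} $$

for all $\alpha \in \mathbb{C}$. Therefore,

$$ \Theta_t[a^\#(f)] = a^\#(e^{ith} f) , \tag{7.16} $$

whence the group $\Theta$ is a quasi-free time-evolution.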
3. Let us consider a single-particle Hamiltonian $h$ with an absolutely continuous
spectrum and the corresponding quasi-free time-evolution (7.16). An instance is
the free time-evolution given by $\langle p |\, h \,| f \rangle = \frac{p^2}{2m}\, f(p)$ in momentum representation.

In this case, one can use the so-called Riemann-Lebesgue Lemma [305]. For an
integrable function $f : \mathbb{R} \to \mathbb{C}$ with integrable first derivative, it follows from
integration by parts:

$$ \int_{\mathbb{R}} d\nu\, f(\nu)\, e^{i\nu t} = \Big[ \frac{f(\nu)\, e^{i\nu t}}{it} \Big]_{-\infty}^{+\infty} + \frac{i}{t} \int_{\mathbb{R}} d\nu\, f'(\nu)\, e^{i\nu t} = \frac{i}{t} \int_{\mathbb{R}} d\nu\, f'(\nu)\, e^{i\nu t} \to 0 $$

when $t \to \pm\infty$. Because of the assumed absolute continuity of the spectrum of
the single-particle Hamiltonian $h$, this lemma ensures that

$$ \lim_{t\to\pm\infty} \big\| [a(f), a^\dagger(e^{ith} g)] \big\| = \lim_{t\to\pm\infty} \big| \langle f | e^{ith} g \rangle \big| = 0 \tag{7.17} $$

for all $f, g \in \mathbb{H}$ for Bosons and

$$ \lim_{t\to\pm\infty} \big\| \{a(f), a^\dagger(e^{ith} g)\} \big\| = \lim_{t\to\pm\infty} \big| \langle f | e^{ith} g \rangle \big| = 0 \tag{7.18} $$

for Fermions. While Bosonic annihilation and creation operators commute asymptotically
in time, Fermionic ones anticommute. However, by means of (7.7), one
computes

$$ [a(f_1)\, a(f_2), a^\dagger(e^{ith} g)] = a(f_1)\, \langle f_2 | e^{ith} g \rangle - \langle f_1 | e^{ith} g \rangle\, a(f_2) . $$

Then, $\| [a(f_1)\, a(f_2), a^\dagger(e^{ith} g)] \| \to 0$ when $t \to \pm\infty$; furthermore, the same
asymptotic commutativity in time holds for $[X, \Theta_t[Y]]$ where $X$ is an even polynomial
in $a$, $a^\dagger$ and $Y$ any polynomial. By continuity, it extends to all $X$ belonging
to the even Fermi algebra and all $Y$ in the quasi-local Fermi algebra $\mathcal{A}_F$. Moreover, this result holds for all
quasi-free automorphisms $\Theta_t[a^\#(f)] = a^\#(U_t f)$ over $\mathcal{A}_F$ consisting of a discrete
or continuous group $\{U_t\}_{t\in\mathbb{R},\mathbb{Z}}$ of single-particle unitaries $U_t : \mathbb{H} \to \mathbb{H}$ such that
$\lim_{t\to\pm\infty} \langle f | U_t g \rangle = 0$ for all $f, g \in \mathbb{H}$. This phenomenon is known as asymptotic
Abelianess.
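
The decay stated by the Riemann-Lebesgue Lemma can be observed numerically. The sketch below takes the Gaussian $f(\nu) = e^{-\nu^2}$ as a hypothetical test function (it is not taken from the text), for which the integral is known in closed form, $\sqrt{\pi}\, e^{-t^2/4}$, and evaluates $\int d\nu\, f(\nu)\, e^{i\nu t}$ with the trapezoidal rule; the cut-off and step count are illustrative choices.

```python
import cmath
import math

def fourier_integral(t, half_width=8.0, steps=4000):
    """Trapezoidal estimate of the integral of exp(-nu^2) * exp(i nu t) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for n in range(steps + 1):
        nu = -half_width + n * h
        weight = 0.5 if n in (0, steps) else 1.0  # end points count half
        total += weight * math.exp(-nu * nu) * cmath.exp(1j * nu * t)
    return total * h

if __name__ == "__main__":
    for t in (0.0, 2.0, 4.0, 8.0):
        # numerical modulus next to the closed form sqrt(pi) * exp(-t^2 / 4)
        print(t, abs(fourier_integral(t)), math.sqrt(math.pi) * math.exp(-t * t / 4.0))
```

For a smooth, rapidly decreasing $f$ the decay is much faster than the $1/|t|$ bound provided by the integration by parts; the bound itself only needs $f$ and $f'$ to be integrable.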


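Definition 7.4.3 [81,130] A quasi-free state on $\mathcal{A}_B$ is any linear functional $\omega_A$
such that $\omega_A(\mathbb{1}) = 1$ and

$$ \omega_A(W(\psi)) = \exp\Big( -\frac{1}{4}\, \langle \psi |\, (\mathbb{1} + 2A) \,| \psi \rangle \Big) , \tag{7.19} $$

where $0 \le A \in \mathbb{B}(\mathbb{H})$ is a positive bounded operator on the single particle Hilbert
space $\mathbb{H} = \mathbb{L}^2_{dr}(\mathbb{R}^3)$.
A quasi-free state on $\mathcal{A}_F$ is any linear functional $\omega_A$ such that $\omega_A(\mathbb{1}) = 1$ and

$$ \omega_A\big( a^\dagger(f_m) \cdots a^\dagger(f_1)\, a(g_1) \cdots a(g_n) \big) = \delta_{nm}\, \mathrm{Det}\big[ \langle g_i |\, A \,| f_j \rangle \big] , \tag{7.20} $$

where $0 \le A \le \mathbb{1} \in \mathbb{B}(\mathbb{H})$ is a single particle operator on $\mathbb{H} = \mathbb{L}^2_{dr}(\mathbb{R}^3)$.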


The Fock vacuum satisfying (7.12) is the simplest instance of a quasi-free state:
the one with $A = 0$. Like classical Gaussian states, quasi-free states can also be
reconstructed from their two-point correlation functions

$$ \omega_A\big( a^\dagger(f)\, a(g) \big) = \langle g |\, A \,| f \rangle \qquad \forall\, f, g \in \mathbb{H} . \tag{7.21} $$

This property results directly from the determinant in (7.20) for Fermions, while for
Bosons it can be proved by showing that (7.19) leads to

$$ \omega_A\big( a^\dagger(f_m) \cdots a^\dagger(f_1)\, a(g_1) \cdots a(g_n) \big) = \delta_{nm}\, \mathrm{Per}\big[ \langle g_i |\, A \,| f_j \rangle \big] , \tag{7.22} $$

where the so-called permanent is as in (5.195); indeed, (7.19) is a generalization of
(5.196) to the infinite dimensional case.


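Example 7.4.3 [81] Suppose a quasi-free state satisfies the KMS conditions (5.189)
with respect to the quasi-free time-evolution (7.16); using the commutation relations
and (7.15), it turns out that

$$ \omega_A\big( a^\dagger(f)\, a(g) \big) = \langle g |\, A \,| f \rangle = \omega_A\big( a(g)\, \Theta_{i\beta}[a^\dagger(f)] \big) = \omega_A\big( a(g)\, a^\dagger(e^{-\beta h} f) \big) $$
$$ = \langle g |\, e^{-\beta h} \,| f \rangle \mp \omega_A\big( a^\dagger(e^{-\beta h} f)\, a(g) \big) $$
$$ = \langle g |\, e^{-\beta h} \mp A\, e^{-\beta h} \,| f \rangle , $$

with the upper sign for Fermions and the lower sign for Bosons. Since this is true for all $f, g \in \mathbb{H}$ it turns out that (compare (5.192) and (5.194))

$$ A_\pm = \frac{e^{-\beta h}}{\mathbb{1} \pm e^{-\beta h}} , $$

where the plus sign holds for Fermions and the minus sign for Bosons.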
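
On a single energy level the KMS relation of this example reduces to a scalar equation that is easy to check. The sketch below assumes a single-particle energy $\varepsilon > 0$ and verifies that the Fermi and Bose occupation numbers $e^{-\beta\varepsilon}/(1 \pm e^{-\beta\varepsilon})$ solve $A = e^{-\beta\varepsilon}(1 \mp A)$; the function names are illustrative.

```python
import math

def occupation(beta, energy, fermions):
    """Mean occupation of a level: plus sign in the denominator for Fermions, minus for Bosons."""
    boltzmann = math.exp(-beta * energy)
    return boltzmann / (1.0 + boltzmann) if fermions else boltzmann / (1.0 - boltzmann)

def kms_residual(beta, energy, fermions):
    """Residual of the scalar KMS relation A = exp(-beta*energy) * (1 -/+ A)."""
    a = occupation(beta, energy, fermions)
    boltzmann = math.exp(-beta * energy)
    sign = -1.0 if fermions else 1.0  # from the anticommutator, respectively commutator
    return a - boltzmann * (1.0 + sign * a)

if __name__ == "__main__":
    for fermions in (True, False):
        print(fermions, occupation(2.0, 0.7, fermions), kms_residual(2.0, 0.7, fermions))
```

The two cases reproduce the Fermi-Dirac and Bose-Einstein distributions $1/(e^{\beta\varepsilon} \pm 1)$ with vanishing chemical potential.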


7.4.1.2 KMS States and Modular Theory 

We have seen that Gibbs states of finite dimensional quantum systems satisfy the
KMS relations (5.189). These relations can be extended to infinitely many degrees of
freedom where they identify equilibrium states at a given temperature [159] (a simple
instance of this fact was offered in the previous example). Unlike with finitely many
degrees of freedom (see Remark 5.6.1.1), there can be more than one equilibrium
state at inverse temperature $\beta$. An equilibrium state is called extremal when it cannot
be decomposed into a linear convex combination of other equilibrium states at the
same temperature; extremal equilibrium states give rise to factor representations and
can be in some cases rightly identified as pure thermodynamical phases [81,353].
Consider a triplet $(\mathcal{A}, \Theta, \omega)$ with a faithful state $\omega$. The latter is said to be a KMS
state at inverse temperature $\beta$ with respect to the automorphism group $\Theta$ if the functions
(compare (5.188))

$$ F_{XY}(t) := \omega\big( \Theta_t[X]\, Y \big) , \qquad G_{XY}(t) := \omega\big( Y\, \Theta_t[X] \big) \qquad \forall\, X, Y \in \mathcal{A} , $$

can be extended to analytic functions $F_{XY}(z)$, respectively $G_{XY}(z)$, on the strips
$-\beta < \Im(z) < 0$, respectively $0 < \Im(z) < \beta$, and continuous on their borders, where
they satisfy

$$ \omega\big( \Theta_t[X]\, Y \big) = \omega\big( Y\, \Theta_{t+i\beta}[X] \big) . \tag{7.23} $$

We outline a few of the many properties of KMS states [353] (for a more detailed
analysis see [81,130]). These properties involve the GNS cyclic representation $\pi_\omega(\mathcal{A})$
on the GNS Hilbert space $\mathbb{H}_\omega$ and the GNS implementation of $\Theta$ by a unitary
operator $U_\omega$.


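Remarks 7.4.3


1. When extended to the von Neumann algebra $\pi_\omega(\mathcal{A})''$, a KMS state $\omega$ remains KMS
in the sense that

$$ \langle \Omega_\omega |\, X\, U_\omega(-t)\, Y \,| \Omega_\omega \rangle = \langle \Omega_\omega |\, Y\, U_\omega(t + i\beta)\, X \,| \Omega_\omega \rangle \qquad \forall\, X, Y \in \pi_\omega(\mathcal{A})'' . $$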
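2. KMS states are $\Theta$-invariant: indeed, (7.23) with $Y = \mathbb{1}$ implies

$$ f_X(t) := \omega\big( \Theta_t[X] \big) = \omega\big( \Theta_{t+i\beta}[X] \big) = f_X(t + i\beta) $$

for all $X \in \mathcal{A}$. Thus, $f_X(t)$ can be periodically extended over the whole of $\mathbb{C}$ where
it defines a bounded analytic function, for $f_X(z)$ is bounded on the strip $-\beta \le
\Im(z) \le 0$. Therefore, this function must be constant: $f_X(t) = f_X(0)$, whence
$\omega(\Theta_t[X]) = \omega(X)$ for all $X \in \mathcal{A}$.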

3. The center $\mathcal{Z}_\omega = \pi_\omega(\mathcal{A})'' \cap \pi_\omega(\mathcal{A})'$ of the GNS representation based on a KMS
state $\omega$ consists of $\Theta$-invariant global observables. Indeed, if $T \in \mathcal{Z}_\omega$, by the same
argument as in the previous point, the function


fx TE) = (Ru |T OLX T IQ.) = (Ru |T Omil X] 120) 
= (Ru | ToO pl XIT 2.) = fx,rt + ip) 


can be extended to a bounded analytic function over $\mathbb{C}$, for all $X \in \mathcal{A}$. Then, it
must be $f_{X,T}(t) = f_{X,T}(0)$. Since $T \in \mathcal{Z}_\omega$, choosing $X = Y^\dagger Z$, $Y, Z \in \mathcal{A}$, yields


friz rE) = (Qo | Tu) UOT ULO mlZ1| Ro) 
= fytz 70) = (2u |T)? T m(Z) (Qu) 


on a dense set, whence $U_\omega(t)\, T\, U_\omega^\dagger(t) = T$ for all $t \in \mathbb{R}$.
4. For fixed inverse temperature, the KMS states form a convex set which is compact
in the w*-topology [353].

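5. A KMS state is said to be extremal KMS if it cannot be written as a convex
combination of other KMS states (at the same inverse temperature). The GNS
representation based on an extremal KMS state is a factor: $\mathcal{Z}_\omega = \{\lambda \mathbb{1}\}$. If not, there
would exist $0 \le T_{1,2} \in \mathcal{Z}_\omega$ with $T_1 + T_2 = \mathbb{1}$ which could be used to construct
the states

$$ \omega_i(X) = \frac{\langle \Omega_\omega |\, T_i\, \pi_\omega(X) \,| \Omega_\omega \rangle}{\langle \Omega_\omega |\, T_i \,| \Omega_\omega \rangle} , \qquad i = 1, 2 , $$

which turn out to be KMS states with respect to $\Theta$ at inverse temperature $\beta$.
Indeed,

$$ \omega_i\big( \Theta_t[X]\, Y \big) = \frac{\langle \Omega_\omega |\, T_i\, \pi_\omega(\Theta_t[X])\, \pi_\omega(Y) \,| \Omega_\omega \rangle}{\langle \Omega_\omega |\, T_i \,| \Omega_\omega \rangle} = \frac{\langle \Omega_\omega |\, \pi_\omega(\Theta_t[X])\, T_i\, \pi_\omega(Y) \,| \Omega_\omega \rangle}{\langle \Omega_\omega |\, T_i \,| \Omega_\omega \rangle} $$
$$ = \frac{\langle \Omega_\omega |\, T_i\, \pi_\omega(Y)\, \pi_\omega(\Theta_{t+i\beta}[X]) \,| \Omega_\omega \rangle}{\langle \Omega_\omega |\, T_i \,| \Omega_\omega \rangle} = \omega_i\big( Y\, \Theta_{t+i\beta}[X] \big) . $$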

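The modular theory or Tomita-Takesaki theory, which has been introduced in its
simplified finite-dimensional version in Sect. 5.5.4, extends to generic von Neumann
algebras $\mathcal{M} \subseteq \mathbb{B}(\mathbb{H})$ with a faithful state (see Definition 5.3.2) [80]. More precisely,
given a quantum triplet $(\mathcal{A}, \Theta, \omega)$, if the GNS vector is such that

$$ X |\Omega_\omega\rangle = 0 \Rightarrow X = 0 \qquad \forall\, X \in \pi_\omega(\mathcal{A})'' , $$

then there exists a modular conjugation $J_\omega : \mathbb{H}_\omega \to \mathbb{H}_\omega$, such that

$$ J_\omega^2 = \mathbb{1} , \qquad J_\omega |\Omega_\omega\rangle = |\Omega_\omega\rangle , \qquad J_\omega\, \pi_\omega(\mathcal{A})''\, J_\omega = \pi_\omega(\mathcal{A})' , \tag{7.24} $$

and a modular operator $\Delta_\omega : \mathbb{H}_\omega \to \mathbb{H}_\omega$ such that

$$ J_\omega\, \Delta_\omega^{1/2}\, X |\Omega_\omega\rangle = X^\dagger |\Omega_\omega\rangle \qquad \forall\, X \in \pi_\omega(\mathcal{A})'' . \tag{7.25} $$

Furthermore, the maps

$$ \sigma_\omega^t : \pi_\omega(\mathcal{A})'' \ni X \mapsto \Delta_\omega^{it}\, X\, \Delta_\omega^{-it} \tag{7.26} $$

form a group $\{\sigma_\omega^t\}_{t\in\mathbb{R}}$ of automorphisms, called modular group; moreover, they
satisfy the KMS conditions

$$ \langle \Omega_\omega |\, X Y \,| \Omega_\omega \rangle = \langle \Omega_\omega |\, Y\, \sigma_\omega^{-i}(X) \,| \Omega_\omega \rangle \qquad \forall\, X, Y \in \pi_\omega(\mathcal{A})'' , \tag{7.27} $$

that we will shortly write as $\omega(XY) = \omega(Y \sigma_\omega^{-i}(X))$.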
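Example 7.4.4 If $\mathcal{M} \subseteq \mathbb{B}(\mathbb{H})$ is an Abelian von Neumann algebra ($\mathcal{M} \subseteq \mathcal{M}'$) with
a cyclic vector $|\Omega\rangle$, then it is maximally Abelian. In fact, $|\Omega\rangle$ is necessarily cyclic
also for the commutant $\mathcal{M}'$, hence separating for the bicommutant $\mathcal{M}'' = \mathcal{M}$ (see
Lemma 5.3.1). Then, (7.25) gives

$$ \| J \Delta^{1/2} X |\Omega\rangle \|^2 = \| X^\dagger |\Omega\rangle \|^2 = \langle \Omega |\, X X^\dagger \,| \Omega \rangle = \langle \Omega |\, X^\dagger X \,| \Omega \rangle = \| X |\Omega\rangle \|^2 , $$

for all $X \in \mathcal{M}$ since $\mathcal{M}$ is Abelian. Therefore, $\Delta = \mathbb{1}$ and $J X J = X^\dagger$, whence
(7.24) yields $\mathcal{M} = \mathcal{M}'$.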


Example 7.4.5 A most used GNS representation [353], is the so-called thermal rep- 
resentation whose cyclic and separating vector is the tensor product of two vacuum 
states, | 23) = | vac) Q | vac ), so that the GNS Hilbert space is isomorphic to the 
tensor product of two Fock Hilbert spaces. 

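We shall consider the framework of Examples 5.6.2.2, 3 without restrictions on
the dimensionality of the single particle Hilbert space and on the cardinality of the
spectrum of the single particle Hamiltonian $h$. We shall denote by $a^\#$, respectively
$b^\#$, Bose, respectively Fermi, creation and annihilation operators. In the Bose case,
their action on $|\Omega_\beta\rangle$ is

\[ \pi_\beta(a(f)) =: a_\beta(f) = a(\sqrt{1+A_-}\,f) \otimes \mathbb{1} + \mathbb{1} \otimes a^\dagger(j\sqrt{A_-}\,f) \]
\[ \pi_\beta(a^\dagger(f)) = a_\beta^\dagger(f) = a^\dagger(\sqrt{1+A_-}\,f) \otimes \mathbb{1} + \mathbb{1} \otimes a(j\sqrt{A_-}\,f) , \]

where the single particle operator $j : \mathbb{H} \to \mathbb{H}$ is antilinear and satisfies $\langle jf|jg\rangle = \langle g|f\rangle$,
while in the Fermi case

\[ \pi_\beta(b(f)) =: b_\beta(f) = b(\sqrt{1-A_+}\,f) \otimes \mathbb{1} + \theta \otimes b^\dagger(j\sqrt{A_+}\,f) \]
\[ \pi_\beta(b^\dagger(f)) = b_\beta^\dagger(f) = b^\dagger(\sqrt{1-A_+}\,f) \otimes \mathbb{1} + \theta \otimes b(j\sqrt{A_+}\,f) , \]

where $\theta$ is an operator on the Fock space such that $\theta\,b^\# = -b^\#\,\theta$ and $\theta|\mathrm{vac}\rangle = |\mathrm{vac}\rangle$. Then,

\[ a_\beta^\dagger(f)|\Omega_\beta\rangle = |\sqrt{1+A_-}\,f\rangle \otimes |\mathrm{vac}\rangle , \quad a_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\sqrt{A_-}\,f\rangle \]
\[ b_\beta^\dagger(f)|\Omega_\beta\rangle = |\sqrt{1-A_+}\,f\rangle \otimes |\mathrm{vac}\rangle , \quad b_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\sqrt{A_+}\,f\rangle , \]
whence

\[ \langle\Omega_\beta|a_\beta^\dagger(f)\,a_\beta(g)|\Omega_\beta\rangle = \langle j\sqrt{A_-}\,f|j\sqrt{A_-}\,g\rangle = \langle g|A_-|f\rangle \]
\[ \langle\Omega_\beta|b_\beta^\dagger(f)\,b_\beta(g)|\Omega_\beta\rangle = \langle j\sqrt{A_+}\,f|j\sqrt{A_+}\,g\rangle = \langle g|A_+|f\rangle . \]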
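The modular operators read

\[ \Delta_\beta = e^{-\beta\sum_i \varepsilon_i a_i^\dagger a_i} \otimes e^{\beta\sum_i \varepsilon_i a_i^\dagger a_i} \ \text{(Bosons)} , \qquad \Delta_\beta = e^{-\beta\sum_i \varepsilon_i b_i^\dagger b_i} \otimes e^{\beta\sum_i \varepsilon_i b_i^\dagger b_i} \ \text{(Fermions)} , \]

where $a_i^\#$ and $b_i^\#$ create or annihilate eigenstates $|\varepsilon_i\rangle$ of the single-particle Hamiltonian
$h$. By means of calculations similar to those that led to (7.9) and (7.10), one
explicitly calculates

\[ \Delta_\beta^{1/2}\,a_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\,e^{\beta h/2}\sqrt{A_-}\,f\rangle \]
\[ \Delta_\beta^{1/2}\,b_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\,e^{\beta h/2}\sqrt{A_+}\,f\rangle . \]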


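One can thus explicitly evaluate the action of the modular conjugation; from (7.25),

\[ J_\beta\,a_\beta(f)|\Omega_\beta\rangle = \Delta_\beta^{1/2}\,a_\beta^\dagger(f)|\Omega_\beta\rangle = |e^{-\beta h/2}\sqrt{1+A_-}\,f\rangle \otimes |\mathrm{vac}\rangle = |\sqrt{A_-}\,f\rangle \otimes |\mathrm{vac}\rangle \]
\[ J_\beta\,a_\beta^\dagger(f)|\Omega_\beta\rangle = \Delta_\beta^{1/2}\,a_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\,e^{\beta h/2}\sqrt{A_-}\,f\rangle = |\mathrm{vac}\rangle \otimes |j\sqrt{1+A_-}\,f\rangle , \]

in the case of Bosons, while for Fermions one obtains

\[ J_\beta\,b_\beta(f)|\Omega_\beta\rangle = \Delta_\beta^{1/2}\,b_\beta^\dagger(f)|\Omega_\beta\rangle = |e^{-\beta h/2}\sqrt{1-A_+}\,f\rangle \otimes |\mathrm{vac}\rangle = |\sqrt{A_+}\,f\rangle \otimes |\mathrm{vac}\rangle \]
\[ J_\beta\,b_\beta^\dagger(f)|\Omega_\beta\rangle = \Delta_\beta^{1/2}\,b_\beta(f)|\Omega_\beta\rangle = |\mathrm{vac}\rangle \otimes |j\,e^{\beta h/2}\sqrt{A_+}\,f\rangle = |\mathrm{vac}\rangle \otimes |j\sqrt{1-A_+}\,f\rangle . \]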


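Since thermal states are faithful, $|\Omega_\beta\rangle$ is cyclic and separating, therefore

\[ J_\beta\,a_\beta(f)\,J_\beta = a^\dagger(\sqrt{A_-}\,f) \otimes \mathbb{1} + \mathbb{1} \otimes a(j\sqrt{1+A_-}\,f) \]
\[ J_\beta\,a_\beta^\dagger(f)\,J_\beta = a(\sqrt{A_-}\,f) \otimes \mathbb{1} + \mathbb{1} \otimes a^\dagger(j\sqrt{1+A_-}\,f) \]

for Bosons and, for Fermions,

\[ J_\beta\,b_\beta(f)\,J_\beta = b^\dagger(\sqrt{A_+}\,f) \otimes \mathbb{1} + \theta \otimes b(j\sqrt{1-A_+}\,f) \]
\[ J_\beta\,b_\beta^\dagger(f)\,J_\beta = b(\sqrt{A_+}\,f) \otimes \mathbb{1} + \theta \otimes b^\dagger(j\sqrt{1-A_+}\,f) . \]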


The thermal representation has been used in [253] to implement the transposition in 
an infinite dimensional context and study the entanglement properties of infinitely 
extended quantum systems (see also [364]), the starting point being (5.159) in the 
finite-dimensional case. Let $V$ be the flip operator which exchanges vectors in tensor
products, $V|\phi \otimes \psi\rangle = |\psi \otimes \phi\rangle$, then

\[ V J_\rho\,(X^\dagger \otimes \mathbb{1})\,J_\rho V = X^T \otimes \mathbb{1} . \]

Analogously, in the thermal representation one may represent the transposition as
follows


yp las f) = Vİ Jaah (f) Ja V- = al GIFA f) @ 141 a VA f) 
pba Pl = Vi Jab) Ja V+ = DUVI- Ay f) @ 1408 bA f). 


Among the CPU maps on a C* algebra $\mathcal{A}$, a special role is played by the conditional
expectations (see Definition 5.2.5). Suppose the orthogonal projections $P_i$ in
Example 5.2.9.1 commute with a given density matrix $\rho \in B_1^+(\mathbb{H})$; it then follows
that

\[ \rho \circ E(X) = \mathrm{Tr}(\rho\,E[X]) = \sum_{i\in I} \mathrm{Tr}(P_i\,\rho\,P_i\,X) = \mathrm{Tr}\Big(\sum_{i\in I} P_i\,\rho\,X\Big) = \rho(X) \]

for all $X \in B(\mathbb{H})$, since $\sum_{i\in I} P_i = \mathbb{1}$. One says that the conditional expectation from
$B(\mathbb{H})$ onto the Abelian subalgebra $P \subset B(\mathbb{H})$ generated by the $P_i$ respects the state $\rho$.
Also, notice that if $\rho$ is faithful then $P$ is left invariant by the modular automorphism
(5.190), that is $\sigma_\rho^t[P] = P$. This is the key point for extending these considerations
to the case of general von Neumann algebras with faithful normal states, where
conditional expectations are identified with normal projections of norm one (see
Remark 5.2.8). We state the result; for a proof see [345].
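The finite-dimensional statement above can be checked numerically. The sketch below assumes the conditional expectation has the pinching form $E[X] = \sum_i P_i X P_i$, with the $P_i$ projecting onto groups of basis vectors, and a density matrix that is block diagonal with respect to those groups (so that it commutes with every $P_i$). The specific matrices and the names `pinch` and `expect` are hypothetical.

```python
# Check that Tr(rho E[X]) = Tr(rho X) when rho commutes with the projections P_i.
# Assumption: E[X] = sum_i P_i X P_i, with P_i projecting onto blocks of basis vectors.

def pinch(x, blocks):
    """Keep only the entries of x whose row and column lie in the same block."""
    n = len(x)
    block_of = {i: b for b, idxs in enumerate(blocks) for i in idxs}
    return [[x[i][j] if block_of[i] == block_of[j] else 0 for j in range(n)]
            for i in range(n)]

def expect(rho, x):
    """Expectation Tr(rho x)."""
    n = len(x)
    return sum(rho[i][k] * x[k][i] for i in range(n) for k in range(n))

blocks = [[0, 1], [2]]                       # P_1 projects onto e_0, e_1; P_2 onto e_2
rho = [[0.4, 0.1, 0.0],                      # block diagonal, positive, unit trace
       [0.1, 0.3, 0.0],
       [0.0, 0.0, 0.3]]
X = [[1.0, 2 - 1j, 0.5j],
     [3.0, -1.0, 4 + 2j],
     [1j, 7.0, 2.0]]

defect = abs(expect(rho, pinch(X, blocks)) - expect(rho, X))
```

If `rho` had a non-zero entry connecting the two blocks, it would no longer commute with the projections and `defect` would in general be non-zero.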


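Proposition 7.4.1 Let $\mathcal{A}$ be a von Neumann algebra, $\omega$ a faithful normal state with
associated modular group of automorphisms $\sigma_\omega^t$, $t \in \mathbb{R}$. Moreover, let $\mathcal{A}_0 \subseteq \mathcal{A}$ be
a von Neumann subalgebra and $\omega_0$ the restriction $\omega|\mathcal{A}_0$ with associated modular
automorphisms $\sigma_{\omega_0}^t$. Then, the following conditions are equivalent:


1. there exists a normal conditional expectation $E : \mathcal{A} \to \mathcal{A}_0$ that respects the state,
$\omega \circ E = \omega$;

2. $\sigma_\omega^t[\mathcal{A}_0] \subseteq \mathcal{A}_0$ for all $t \in \mathbb{R}$;

3. $\sigma_\omega^t[A] = \sigma_{\omega_0}^t[A]$ for all $A \in \mathcal{A}_0$.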


7.4.2 GNS Representation and Dynamics 


Quantum dynamical systems will be identified as non-commutative algebraic triplets 
(compare the analogous commutative Definition 2.2.4). 


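Definition 7.4.4 Quantum dynamical systems are triplets $(\mathcal{A}, \Theta, \omega)$, where $\mathcal{A}$ is a
C* algebra with identity $\mathbb{1}$, the dynamics $\Theta$ corresponds to a group of automorphisms
$\Theta_t : \mathcal{A} \to \mathcal{A}$, $t \in G$, such that

\[ \Theta_t \circ \Theta_s = \Theta_s \circ \Theta_t = \Theta_{t+s} , \quad \omega \circ \Theta_t = \omega , \quad \forall s, t \in G , \]

where $G = \mathbb{Z}$ or $G = \mathbb{R}$ and the state $\omega : \mathcal{A} \to \mathbb{C}$ is a normalized, positive,
$\Theta$-invariant expectation, namely $\omega \circ \Theta_t = \omega$ for all $t \in G$.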


Given an algebraic triplet $(\mathcal{A}, \Theta, \omega)$, a natural Hilbert space formulation is based
on the GNS construction (see Definition 5.3.7); it provides not only a representation
$\pi_\omega(\mathcal{A})$ on a Hilbert space $\mathbb{H}_\omega$ with a cyclic invariant vector $|\Omega_\omega\rangle$, but also an
implementation of the dynamics by a group of unitary operators.


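Proposition 7.4.2 ([129,353]) Let $(\mathcal{A}, \Theta, \omega)$ be a C* dynamical system and
$(\mathbb{H}_\omega, \pi_\omega, \Omega_\omega)$ the associated GNS triplet; then, the C* automorphism $\Theta$ is implemented
by a unique unitary operator $U_\omega : \mathbb{H}_\omega \to \mathbb{H}_\omega$,

\[ \pi_\omega(\Theta(X)) = U_\omega^\dagger\,\pi_\omega(X)\,U_\omega \quad \forall X \in \mathcal{A} . \tag{7.28} \]

Proof Given the GNS representation $\pi_\omega$, $\pi := \pi_\omega \circ \Theta$ is another representation of
$\mathcal{A}$ on $\mathbb{H}_\omega$ such that

\[ \langle\Omega_\omega|\pi(X)|\Omega_\omega\rangle = \langle\Omega_\omega|\pi_\omega(\Theta(X))|\Omega_\omega\rangle = \omega(\Theta(X)) = \omega(X) . \]

Therefore, Remark 5.3.2.1 ensures the existence of a unitary operator $U_\omega$ such that
(7.28) holds. If another unitary operator $W$ with the same properties exists, then
$[W^\dagger U_\omega , \pi_\omega(X)]\,\pi_\omega(Y)|\Omega_\omega\rangle = 0$ for all $X, Y \in \mathcal{A}$, namely on a dense set; therefore,
$W^\dagger U_\omega$ belongs to $\pi_\omega(\mathcal{A})'$ for which $|\Omega_\omega\rangle$ is separating (see Lemma 5.3.1). Then,
$W = U_\omega$ since $(W^\dagger U_\omega - \mathbb{1})|\Omega_\omega\rangle = 0$.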


Remarks 7.4.4 


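1. If the dynamics is specified by a one-parameter group of C* automorphisms
$\{\Theta_t\}_{t\in\mathbb{R}}$ which is weakly-continuous in the GNS representation, then the group
$U_\omega(G) := \{U_\omega(t)\}_{t\in G}$, $G = \mathbb{R}$, is strongly continuous on $\mathbb{H}_\omega$. In fact, the previous
Proposition asserts that each $\Theta_t$ is implemented by a unitary operator $U_\omega(t)$;
furthermore,

\[ U_\omega^\dagger(t)\,U_\omega^\dagger(s)\,\pi_\omega(X)|\Omega_\omega\rangle = U_\omega^\dagger(t)\,\pi_\omega(\Theta_s(X))|\Omega_\omega\rangle = \pi_\omega(\Theta_{t+s}(X))|\Omega_\omega\rangle = U_\omega^\dagger(t+s)\,\pi_\omega(X)|\Omega_\omega\rangle \]

on a dense set, whence the family $\{U_\omega(t)\}_{t\in\mathbb{R}}$ forms a one-parameter group of
unitaries on $\mathbb{H}_\omega$. Strong continuity follows from weak continuity and

\[ \|(U_\omega^\dagger(t) - \mathbb{1})\,\pi_\omega(X)|\Omega_\omega\rangle\|^2 = 2\,\omega(X^\dagger X) - 2\,\mathrm{Re}\,\omega(X^\dagger\,\Theta_t(X)) . \]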
2. The Fock representation is unitarily equivalent to the GNS representation based
on the vacuum state: $\omega(a(f)) = \langle\mathrm{vac}|a(f)|\mathrm{vac}\rangle = 0$ for all $|f\rangle$ in the single-particle
Hilbert space $\mathbb{H}$. Suppose $\Theta$ is a quasi-free Fermi automorphism as in
Example 7.4.2.3; let $V : \mathbb{H}_F \to \mathbb{H}_F$ be the unitary operator that implements it on
the Fock space. Since the number operator is left invariant by quasi-free automorphisms,
if $V$ belonged to $\mathcal{A}_F$, then it should also belong to the even Fermi algebra.
Further, from asymptotic Abelianess, the invariance of the norm under unitary
transformations and the fact that the various $V_t$ commute, it turns out that,
for any $\varepsilon > 0$ and $X \in \mathcal{A}_F$,

\[ \|[V_t , \Theta_s[X]]\| = \|V_t\,V_s^\dagger X V_s - V_s^\dagger X V_s\,V_t\| = \|V_t X - X V_t\| = \|X - V_t^\dagger X V_t\| \le \varepsilon \]

for all $t \in \mathbb{R}$. This cannot be true for all $X \in \mathcal{A}_F$ so that the unitary operator
$V \in B(\mathbb{H}_F)$ does not belong to $\mathcal{A}_F$. However, since $\mathcal{A}_F$ is irreducible (see (7.12)),
$V$ belongs to the bicommutant $\mathcal{A}_F'' = (\mathcal{A}_F')' = B(\mathbb{H}_F)$.

3. Very rarely, starting from the Hamiltonian of a system of N interacting particles 
and going to the thermodynamic limit, one obtains a norm-continuous dynamics 
at the C* algebraic level, that is independently of a given time-invariant state. 
Usually, the dynamics exists only in the GNS representation provided by that 
state, however, an instance of Galilei invariant interaction which gives rise to a 
norm-continuous group of automorphisms of the CAR algebra can be found in 
[353]. 


Example 7.4.6 (Infinite Dimensional Quantum Cat Maps) The finite dimensional
quantization of the torus $\mathbb{T}^2$ studied in Example 5.4.3 can be turned into an infinite
dimensional one by lifting the condition (5.88), namely the quantum counterpart of
the folding constraint (2.15) in Example 2.1.3. Concretely, the Weyl relations (5.85)
become

\[ U^{n_1} V^{n_2} = e^{4\pi i\,\theta\,n_1 n_2}\,V^{n_2} U^{n_1} , \quad \theta \in [0, 1) , \]

where $U$ and $V$ are two abstract unitary operators and $n = (n_1, n_2) \in \mathbb{Z}^2$. Notice that
$2\theta$ plays the role of $1/N$ in Example 5.4.3 and is a continuous deformation parameter:
when $\theta = 0$, the commutation relations are those that hold for the exponential
functions (2.21), namely

\[ e_n\,e_m = e_m\,e_n . \]

Then, as in (5.86), we define the unitary Weyl-like operators
\[ W_\theta(n) := e^{-2\pi i\,\theta\,n_1 n_2}\,U^{n_1} V^{n_2} , \quad n \in \mathbb{Z}^2 , \]
that satisfy relations similar to those in (5.87),
\[ W_\theta(n)\,W_\theta(m) = e^{2\pi i\,\theta\,\sigma(n,m)}\,W_\theta(n+m) , \quad \forall n, m \in \mathbb{Z}^2 , \tag{7.29} \]

with symplectic form $\sigma(n, m) := n_1 m_2 - n_2 m_1$. Also, in analogy with (7.11), one
sets

\[ W_\theta(f) := \sum_{n\in\mathbb{Z}^2} f(n)\,W_\theta(n) , \tag{7.30} \]
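where $|f\rangle = \{f(n)\}_{n\in\mathbb{Z}^2}$ belongs to the subspace $\ell^2_*(\mathbb{Z}^2) \subset \ell^2(\mathbb{Z}^2)$ of square-summable
sequences with finitely many non-zero components. We shall call support
of $f$ the set

\[ \mathrm{Supp}(f) := \{ n \in \mathbb{Z}^2 : f(n) \neq 0 \} . \tag{7.31} \]

The following properties hold for all $f, g \in \ell^2_*(\mathbb{Z}^2)$,

\[ W_\theta(f)^\dagger = W_\theta(f^\dagger) , \quad f^\dagger(n) = f(-n)^* , \]
\[ W_\theta(f)\,W_\theta(g) = W_\theta(f * g) , \quad \text{with} \tag{7.32} \]
\[ (f * g)(n) := \sum_{m\in\mathbb{Z}^2} e^{2\pi i\,\theta\,\sigma(n,m)}\,f(n-m)\,g(m) . \tag{7.33} \]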


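Consider the *-algebra $\mathcal{A}_\theta := \{ W_\theta(f) : f \in \ell^2_*(\mathbb{Z}^2) \}$ generated by all possible linear
combinations of Weyl operators $W_\theta(f)$ with $f \in \ell^2_*(\mathbb{Z}^2)$. Let then $\omega$ denote the
linear functional $\omega : \mathcal{A}_\theta \to \mathbb{C}$ such that

\[ \omega(W_\theta(n)) = \delta_{n,0} . \tag{7.34} \]
Using (7.32) with (7.33) one checks that
\[ \omega\big(W_\theta(f)^\dagger\,W_\theta(g)\big) = (f^\dagger * g)(0) = \sum_{m\in\mathbb{Z}^2} f(m)^*\,g(m) = \langle f|g\rangle . \tag{7.35} \]

Thus $\omega$ is a positive normalized functional, namely a state on $\mathcal{A}_\theta$ similar to $\omega_N$ in
(5.197). Consider the associated GNS representation $\pi_\omega(\mathcal{A}_\theta)$ and set

\[ |f\rangle_\theta := \pi_\omega(W_\theta(f))|\Omega_\omega\rangle \quad \forall f \in \ell^2_*(\mathbb{Z}^2) , \]

where $|\Omega_\omega\rangle$ is the cyclic GNS vector. Then, the vectors $|n\rangle_\theta$ form an ONB in
$\mathbb{H}_\theta$ and ${}_\theta\langle n|f\rangle_\theta = \langle n|f\rangle = f(n)$. Therefore, $\mathbb{H}_\theta = \ell^2(\mathbb{Z}^2) = L^2_{dr}(\mathbb{T}^2)$, independently
of the deformation parameter $\theta$; also

\[ \pi_\omega(W_\theta(f))|g\rangle = |f * g\rangle . \tag{7.36} \]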
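The algebraic rules above can be explored on finitely supported sequences. The following sketch assumes the twisted convolution has the form $(f*g)(n) = \sum_m e^{2\pi i\theta\sigma(n,m)} f(n-m)\,g(m)$ with $\sigma(n,m) = n_1 m_2 - n_2 m_1$ and $f^\dagger(n) = f(-n)^*$; sequences are stored as dictionaries from lattice points to coefficients. All names and numerical values are illustrative assumptions.

```python
import cmath

def sigma(n, m):
    """Symplectic form sigma(n, m) = n1*m2 - n2*m1."""
    return n[0] * m[1] - n[1] * m[0]

def twisted_conv(f, g, theta):
    """(f * g)(n) = sum_m exp(2 pi i theta sigma(n, m)) f(n - m) g(m)."""
    out = {}
    for k, fk in f.items():              # k stands for n - m
        for m, gm in g.items():
            n = (k[0] + m[0], k[1] + m[1])
            phase = cmath.exp(2j * cmath.pi * theta * sigma(n, m))
            out[n] = out.get(n, 0j) + phase * fk * gm
    return out

def dagger(f):
    """f^dagger(n) = conj(f(-n))."""
    return {(-n[0], -n[1]): complex(v).conjugate() for n, v in f.items()}

def inner(f, g):
    """<f|g> = sum_n conj(f(n)) g(n)."""
    return sum(complex(v).conjugate() * g.get(n, 0j) for n, v in f.items())

def dist(a, b):
    """Largest coefficient difference between two finitely supported sequences."""
    return max(abs(a.get(n, 0j) - b.get(n, 0j)) for n in set(a) | set(b))

theta = 0.3                               # hypothetical deformation parameter
f = {(0, 0): 1 + 0j, (1, 0): 0.5j, (1, 2): -1 + 0j}
g = {(0, 1): 2 + 0j, (1, 0): 1 - 1j, (-1, -1): 0.25 + 0j}
h = {(2, 1): 1j, (0, 0): 1 + 0j}

# Associativity of the product W(f) W(g) W(h)
assoc_defect = dist(twisted_conv(twisted_conv(f, g, theta), h, theta),
                    twisted_conv(f, twisted_conv(g, h, theta), theta))
# omega(W(f)^dagger W(g)) = (f^dagger * g)(0) should equal <f|g>
gns_defect = abs(twisted_conv(dagger(f), g, theta).get((0, 0), 0j) - inner(f, g))
```

The second check does not depend on `theta`, in line with the GNS Hilbert space being the same for every deformation parameter.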


The *-algebra $\mathcal{A}_\theta$ can be equipped with a *-automorphism which extends to the
present case the dynamics discussed in Example 5.6.1.4: it is defined on the Weyl
operators by

\[ \Theta_A[W_\theta(n)] = W_\theta(A^T n) , \tag{7.37} \]

where $A$ is a $2 \times 2$ integer matrix as in (5.186). The state $\omega$ is left invariant by $\Theta_A$
which is then implemented by the same unitary operator $U_\omega$ for all $\theta \in [0, 1)$ that
coincides with the Koopman operator $U_A$ of Example 2.1.3. Indeed,

\[ \Theta_A[W_\theta(f)] = \sum_n f(n)\,W_\theta(A^T n) = \sum_p f(A^{-T} p)\,W_\theta(p) = W_\theta(U_A f) , \tag{7.38} \]

for $f(A^{-T} n) = (U_A f)(n)$ (compare (2.1.3)). Consequently,

\[ \omega\big(W_\theta(f)^\dagger\,\Theta_A[W_\theta(g)]\big) = \langle\Omega_\omega|\pi_\omega(W_\theta(f))^\dagger\,U_\omega\,\pi_\omega(W_\theta(g))|\Omega_\omega\rangle = \omega\big(W_\theta(f)^\dagger\,W_\theta(U_A g)\big) = \langle f|U_A|g\rangle . \tag{7.39} \]
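That relabelling the Weyl operators by $n \mapsto A^T n$ respects the product rule rests on the invariance of the symplectic form, $\sigma(A^T n, A^T m) = \det(A)\,\sigma(n,m)$ with $\det A = 1$. The sketch below checks this and the resulting homomorphism property on finitely supported sequences. The Arnold cat matrix is used purely as an illustrative choice of $A$, the twisted convolution is assumed to have the form $(f*g)(n) = \sum_m e^{2\pi i\theta\sigma(n,m)} f(n-m)g(m)$, and the helper names are hypothetical.

```python
import cmath

A = ((1, 1), (1, 2))                      # illustrative integer matrix with det A = 1

def sigma(n, m):
    """Symplectic form sigma(n, m) = n1*m2 - n2*m1."""
    return n[0] * m[1] - n[1] * m[0]

def At(n):
    """A^T n for the 2x2 integer matrix A."""
    return (A[0][0] * n[0] + A[1][0] * n[1], A[0][1] * n[0] + A[1][1] * n[1])

def push(f):
    """Coefficients of Theta_A[W(f)]: the weight f(n) is moved from n to A^T n."""
    return {At(n): v for n, v in f.items()}

def twisted_conv(f, g, theta):
    """(f * g)(n) = sum_m exp(2 pi i theta sigma(n, m)) f(n - m) g(m)."""
    out = {}
    for k, fk in f.items():
        for m, gm in g.items():
            n = (k[0] + m[0], k[1] + m[1])
            out[n] = out.get(n, 0j) + cmath.exp(2j * cmath.pi * theta * sigma(n, m)) * fk * gm
    return out

def dist(a, b):
    """Largest coefficient difference between two finitely supported sequences."""
    return max(abs(a.get(n, 0j) - b.get(n, 0j)) for n in set(a) | set(b))

theta = 0.3                               # hypothetical deformation parameter
f = {(0, 0): 1 + 0j, (1, 0): 0.5j, (1, 2): -1 + 0j}
g = {(0, 1): 2 + 0j, (1, 0): 1 - 1j, (-1, -1): 0.25 + 0j}

points = [(1, 0), (0, 1), (2, -3), (-1, 4)]
symplectic_ok = all(sigma(At(n), At(m)) == sigma(n, m) for n in points for m in points)
# Theta_A[W(f) W(g)] versus Theta_A[W(f)] Theta_A[W(g)]
hom_defect = dist(push(twisted_conv(f, g, theta)),
                  twisted_conv(push(f), push(g), theta))
```

With a matrix of determinant different from one, `symplectic_ok` fails and the relabelling no longer preserves the Weyl relations.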


The dependence on $\theta \in [0, 1)$ emerges when considering the closure of $\pi_\omega(\mathcal{A}_\theta)$
with respect to the strong-operator topology, thus obtaining von Neumann subalgebras
$M_\theta \subseteq B(\mathbb{H}_\theta)$.

While the GNS Hilbert space and the unitary implementation of the dynamics are
the same for all von Neumann dynamical triplets $(M_\theta, \Theta_A, \omega)$,¹ for $\theta = 0$ $M_\theta$ is
isomorphic to the maximally Abelian von Neumann algebra $L^\infty_{dr}(\mathbb{T}^2)$ of essentially
bounded functions on $\mathbb{T}^2$ (see Sect. 5.3.2), for $\theta$ irrational $M_\theta$ is a hyperfinite $II_1$
factor, while $M_\theta$ is not a factor and of finite type $I_q$ when $\theta$ is rational.

Let us consider the case $\theta = 0$; clearly, $M_0$ is an Abelian von Neumann subalgebra
of $B(\mathbb{H}_\theta)$, actually maximally Abelian since it has a cyclic vector $|\Omega_\omega\rangle$
(see Example 7.4.4.1); therefore, it is isomorphic to $L^\infty_{dr}(\mathbb{T}^2)$ via the argument of
Theorem 5.3.3 and a one-to-one mapping

\[ e_n \mapsto W_0(n) \tag{7.40} \]
between the Weyl operators $W_0(n)$ and the exponential functions (2.21). Indeed, one
observes that the multiplication of vectors in $L^2_{dr}(\mathbb{T}^2)$ by $e_n$ and the action (7.36) of
$W_0(n)$ on vectors in $\mathbb{H}_\theta$ do coincide. We shall thus identify $M_0 = L^\infty_{dr}(\mathbb{T}^2)$.

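Consider now the case of a rational deformation parameter, $\theta = p/q$, $p, q \in \mathbb{N}$;
then, (7.29) yields
\[ W_{p/q}(qn)\,W_{p/q}(m) = e^{2\pi i\,p\,\sigma(n,m)}\,W_{p/q}(qn+m) = W_{p/q}(m)\,W_{p/q}(qn) , \]
for all $n, m \in \mathbb{Z}^2$. Further, set

\[ \mathbb{Z}^2 \ni n = (n_1, n_2) = [n] + \langle n\rangle := ([n_1] + \langle n_1\rangle , [n_2] + \langle n_2\rangle) , \]

where, for any $n \in \mathbb{Z}$, $[n] = q\,m$ denotes the unique multiple of $q$ such that
$0 \le n - [n] =: \langle n\rangle \le q - 1$. Then, one gets

\[ W_{p/q}(n) = W_{p/q}([n])\,W_{p/q}(\langle n\rangle) . \]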
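The splitting of a lattice point into a multiple of $q$ plus a remainder, and the triviality of the corresponding Weyl phases, can be verified with exact rational arithmetic. The sketch assumes the Weyl phase in (7.29) is $e^{2\pi i\theta\sigma(n,m)}$ with $\sigma(n,m) = n_1 m_2 - n_2 m_1$; the values of $p$ and $q$ and the helper names are hypothetical.

```python
from fractions import Fraction

def sigma(n, m):
    """Symplectic form sigma(n, m) = n1*m2 - n2*m1."""
    return n[0] * m[1] - n[1] * m[0]

def split(n, q):
    """Componentwise n = [n] + <n> with [n] a multiple of q and 0 <= <n> <= q - 1."""
    parts = [divmod(c, q) for c in n]     # floor division also handles negative entries
    return tuple(q * k for k, _ in parts), tuple(r for _, r in parts)

def phase_exponent(theta, n, m):
    """theta * sigma(n, m) modulo 1; the Weyl phase is exp(2 pi i * this value)."""
    return (theta * sigma(n, m)) % 1

p, q = 2, 5                               # hypothetical rational parameter theta = p/q
theta = Fraction(p, q)
grid = [(a, b) for a in range(-7, 8) for b in range(-7, 8)]

all_ok = True
for n in grid:
    bracket, rest = split(n, q)
    all_ok &= (bracket[0] + rest[0], bracket[1] + rest[1]) == n
    all_ok &= all(0 <= r <= q - 1 for r in rest)
    # W([n]) W(<n>) = W(n): the phase between [n] and <n> is trivial
    all_ok &= phase_exponent(theta, bracket, rest) == 0
    # W([n]) commutes with W(m): the phase is trivial for every sampled m
    all_ok &= all(phase_exponent(theta, bracket, m) == 0 for m in grid[::17])
```

Using `Fraction` rather than floating point keeps the comparison with zero exact.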


¹ $\Theta_A$ and $\omega$ denote the extensions of the automorphism $\Theta_A$ and of the state $\omega$ from $\mathcal{A}_\theta$ to the
strong-operator closures $M_\theta$.
As a consequence, when $\theta$ is rational, every $W_{p/q}(f)$ can be written as

\[ W_{p/q}(f) = \sum_{\langle n\rangle\in J(q)} \Big( \sum_{[n]} f([n] + \langle n\rangle)\,W_{p/q}([n]) \Big) W_{p/q}(\langle n\rangle) = \sum_{s\in J(q)} X_f(s)\,W_{p/q}(s) \quad \text{with} \tag{7.41} \]
\[ X_f(s) := \sum_{n\in\mathbb{Z}^2} f(qn+s)\,W_{p/q}(qn) \in M_{p/q}^{(q)} , \tag{7.42} \]

$J(q) := \{ s = (s_1, s_2) : 0 \le s_i \le q - 1 \}$ and where
\[ M_{p/q}^{(q)} := \Big\{ \sum_{n\in\mathbb{Z}^2} f(qn)\,W_{p/q}(qn) \Big\}'' \tag{7.43} \]

denotes the von Neumann subalgebra of $M_{p/q}$ linearly generated by the Weyl operators
of the form $W_{p/q}(qn)$, $n \in \mathbb{Z}^2$. Because of (7.41), they commute with $M_{p/q}$
whence $M_{p/q}^{(q)}$ belongs to the center of $M_{p/q}$.

Moreover, the exponential functions of the form $W_0(qn)$ fulfil

\[ W_0(qn)(r) = W_0(qn)(r + s/q) \quad \forall s \in J(q) . \tag{7.44} \]

They generate a *-algebra whose strong-closure is a von Neumann subalgebra
$M_0^{(q)} \subset M_0$ of essentially bounded functions $f$ on $\mathbb{T}^2$ such that


FO = WIM := fr+s/q), (7.45) 


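for all $s \in J(q)$. Let $\Pi_s : M_0 \to M_0^{(q)}$ be defined by

\[ M_0 \ni f \mapsto \Pi_s[f] := \sum_{n\in\mathbb{Z}^2} f(qn+s)\,W_0(qn) \in M_0^{(q)} , \tag{7.46} \]

where $s \in J(q)$. By decomposing

\[ f = \sum_n f(n)\,W_0(n) = \sum_{s\in J(q)} \Big( \sum_m f(qm+s)\,W_0(qm) \Big) W_0(s) , \]

it follows that $\Pi_s$ can be recast as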


1 . 
Mif] = Wos) -z X Per, (7.47) 
teJ(q) 


Furthermore, a map similar to the one in (7.40),

\[ W_0(qn) \mapsto \Phi_q[W_0(qn)] = W_{p/q}(qn) \quad \forall n \in \mathbb{Z}^2 , \tag{7.48} \]
makes $M_0^{(q)}$ and $M_{p/q}^{(q)}$ isomorphic so that (7.42), respectively (7.41) read
$X_f(s) = \Phi_q[\Pi_s[f]]$, respectively

\[ W_{p/q}(f) = \sum_{s\in J(q)} \Phi_q[\Pi_s[f]]\,W_{p/q}(s) . \tag{7.49} \]

Concluding: (1) $M_{p/q}$ is not a factor, (2) due to the finitely many non-commuting
$W_{p/q}(s)$ with $s \in J(q)$, the type of $M_{p/q}$ is finite $I_q$ (see the discussion of types
preceding Sect. 7.4) and (3) $M_{p/q}$ is hyperfinite for such is $M_0^{(q)}$ (and thus $M_{p/q}^{(q)}$)
according to Remark 7.4.1.
For $\theta$ irrational, $M_\theta$ is a factor; indeed, from (7.29),

\[ [W_\theta(n) , W_\theta(m)] = 2i\,\sin(2\pi\,\theta\,\sigma(n,m))\,W_\theta(n+m) \tag{7.50} \]

cannot vanish unless $\sigma(n,m) = 0$, whence the center $Z = M_\theta \cap M_\theta'$ is trivial, that is it
consists of multiples of the identity only. Since the state (7.34) is a trace on $M_\theta$,
according to the discussion following Definition 7.34, $M_\theta$ is a type $II_1$ factor and
also hyperfinite [310].
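For completeness, the commutator (7.50) can be obtained in two lines. The derivation below assumes that the phase in the Weyl relation (7.29) is $e^{2\pi i\theta\sigma(n,m)}$ and uses only the antisymmetry $\sigma(m,n) = -\sigma(n,m)$ of the symplectic form:

```latex
\begin{align*}
[W_\theta(n), W_\theta(m)]
  &= W_\theta(n)W_\theta(m) - W_\theta(m)W_\theta(n) \\
  &= \big(e^{2\pi i\theta\sigma(n,m)} - e^{-2\pi i\theta\sigma(n,m)}\big)\,W_\theta(n+m) \\
  &= 2i\,\sin\big(2\pi\theta\,\sigma(n,m)\big)\,W_\theta(n+m) .
\end{align*}
```

For irrational $\theta$ the sine vanishes only if $\sigma(n,m) = 0$, since $2\theta\sigma(n,m)$ is otherwise never an integer.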


In the commutative setting of Example 2.2.3, the unitary $U_\omega$ corresponds to the
Koopman-von Neumann operator which cannot belong to the commutative von Neumann
algebra $M_\mu = L^\infty_\mu(X)$.

As much as in this case and unlike for finite level quantum systems, the quantum
dynamics is typically implemented by unitary operators which map the algebra of
observables into itself, indeed

\[ U_\omega^\dagger(t)\,\pi_\omega(X)\,U_\omega(t) = \pi_\omega(\Theta_t(X)) \in \pi_\omega(\mathcal{A}) , \tag{7.51} \]

without $U_\omega(t)$ itself belonging to $\pi_\omega(\mathcal{A})$. Given $\pi_\omega(\mathcal{A})$ and $U_\omega(G)$, one can however
consider all linear combinations of products of operators in $\pi_\omega(\mathcal{A})$ and elements
$U_\omega(t)$ which can always be reduced to the form (compare Example 5.3.2.7 for a
similar structure)

\[ \sum_{i\in I} \pi_\omega(X_i)\,U_\omega(t_i) . \tag{7.52} \]
These elements form an algebra, denoted by $\{\pi_\omega(\mathcal{A}), U_\omega(G)\}$, whose bi-commutant,
namely its strong closure on $\mathbb{H}_\omega$, turns out to be a useful tool to discuss ergodicity
and mixing in quantum dynamics.


Definition 7.4.5 (Covariance Algebra) Given a quantum dynamical system
$(\mathcal{A}, \Theta, \omega)$ and the GNS implementation of the dynamics, the associated covariance
algebra is the von Neumann algebra $R_\omega := \{\pi_\omega(\mathcal{A}), U_\omega(G)\}''$.
As we have seen in Sect. 5.3, besides the von Neumann algebra $R_\omega$ itself, what
is also important is its commutant, in particular, in the framework of the GNS construction,
for what concerns the convex decompositions of the reference state $\omega$
(see Remark 5.3.2.3). As regards the covariance algebra $R_\omega$ and its commutant $R_\omega'$,
notice that if $X \in B(\mathbb{H}_\omega)$ commutes with $R_\omega$, it must commute with both $\pi_\omega(\mathcal{A})$ and
$U_\omega(G)$. Vice versa, if $X \in B(\mathbb{H}_\omega)$ commutes with $\pi_\omega(\mathcal{A})$ and $U_\omega(G)$, by continuity,
it also commutes with the von Neumann algebra generated by them. Therefore,

\[ R_\omega' = \pi_\omega(\mathcal{A})' \cap U_\omega(G)' , \tag{7.53} \]

where $U_\omega(G)'$ is the algebra of the bounded constants of the motion, that is the
algebra of all $X \in B(\mathbb{H}_\omega)$ such that $U_\omega(t)^\dagger\,X\,U_\omega(t) = X$.


Example 7.4.7 For the case of Example 5.6.2 the covariance algebra is 


n 


Ry = (TM) U VR) = (MO 8 ta Up" 80) 
= M2(C) 8 {12, 03}, 


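where $\{\mathbb{1}_2, \sigma_3\}$ stands for the commutative algebra of $2 \times 2$ matrices which are
diagonal in the eigenbasis of $\rho$. Furthermore, the constants of the motion are contained
in $U_\rho(\mathbb{R})' = \{\mathbb{1}_2, \sigma_3\} \otimes \{\mathbb{1}_2, \sigma_3\}$ and

\[ R_\rho' = \pi_\rho(M_2(\mathbb{C}))' \cap U_\rho(\mathbb{R})' = \mathbb{1}_2 \otimes \{\mathbb{1}_2, \sigma_3\} . \]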


7.4.3 Quantum Ergodicity and Mixing 


As seen in Sect. 2.3, ergodicity corresponds to a specific behavior of the time-averages
of two-point correlation functions. Given a quantum dynamical triplet $(\mathcal{A}, \Theta, \omega)$,
two-point correlation functions have the form $\omega(A\,\Theta_t(B))$ where $A, B \in \mathcal{A}$ and
$t \in G$, where $G = \mathbb{R}$ or $\mathbb{Z}$. For sake of concreteness, we will consider averages or
invariant means² as in (7.3), namely of the form (compare Definition 2.3.1)

\[ \eta_t\big[\omega(A^\dagger\,\Theta_t(B)\,C)\big] = \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \omega(A^\dagger\,\Theta_t(B)\,C) \tag{7.54} \]

in discrete time, or else

\[ \eta_t\big[\omega(A^\dagger\,\Theta_t(B)\,C)\big] = \lim_{T\to\infty} \frac{1}{T} \int_0^T dt\ \omega(A^\dagger\,\Theta_t(B)\,C) \quad \text{if } G = \mathbb{R} . \tag{7.55} \]

² For more details on the existence of invariant means see [129].
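Because of (5.51), it turns out that these averages are bounded,

\[ \big|\eta_t\big[\omega(A^\dagger\,\Theta_t(B)\,C)\big]\big| \le \|A\|\,\|B\|\,\|C\| . \]
Also, in the GNS construction based on the invariant state $\omega$, the averages

\[ \eta_t\big[\omega(A^\dagger\,\Theta_t(B)\,C)\big] = \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,U_\omega^\dagger(t)\,\pi_\omega(B)\,U_\omega(t)\,\pi_\omega(C)|\Omega_\omega\rangle\big] \]
provide bounded sesquilinear forms on a dense subset of the GNS Hilbert space $\mathbb{H}_\omega$.

This observation together with an argument similar to that in Example 5.3.2.3 leads
one to introduce


1. an operator $\eta_\omega[U_\omega] \in B(\mathbb{H}_\omega)$ with matrix elements

\[ \langle\psi|\eta_\omega[U_\omega]|\phi\rangle = \eta_t\big[\langle\psi|U_\omega(t)|\phi\rangle\big] \quad \forall \psi, \phi \in \mathbb{H}_\omega ; \tag{7.56} \]

2. a linear map $\eta_\omega : \mathcal{A} \to B(\mathbb{H}_\omega)$ defined by

\[ \langle\psi|\eta_\omega[A]|\phi\rangle = \eta_t\big[\langle\psi|U_\omega^\dagger(t)\,\pi_\omega(A)\,U_\omega(t)|\phi\rangle\big] \quad \forall \psi, \phi \in \mathbb{H}_\omega . \tag{7.57} \]

Notice that, because of $\Theta$-invariance, $\omega(\eta_\omega[X]) = \omega(X)$.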


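Examples 7.4.8 1. Consider a finite-dimensional dynamical system described by
a finite-dimensional Hilbert space $\mathbb{H} = \mathbb{C}^d$, by observables that are matrices in
$M_d(\mathbb{C})$ and by a Hamiltonian $H$ assumed to have non-degenerate eigenvalues
$e_{d-1} > e_{d-2} > \cdots > e_0 = 0$ and eigenvectors $|j\rangle$, $j = 0, 1, \ldots, d-1$. The dynamics is thus
given by (see (5.180))

\[ U_t = e^{-itH} = |0\rangle\langle 0| + \sum_{j=1}^{d-1} e^{-it e_j}\,|j\rangle\langle j| \]
\[ X \mapsto X_t = U_t^\dagger\,X\,U_t = \sum_{j,k=0}^{d-1} e^{it(e_j - e_k)}\,\langle j|X|k\rangle\,|j\rangle\langle k| , \]

for all $X \in M_d(\mathbb{C})$. Then, the time-average $\eta$ yields

\[ \eta(U) = |0\rangle\langle 0| , \quad \eta(X) = \sum_{j=0}^{d-1} \langle j|X|j\rangle\,|j\rangle\langle j| . \]

Thus, $\eta : M_d(\mathbb{C}) \to M_d(\mathbb{C})$ amounts to the conditional expectation (see Example 5.2.9.1)
onto the Abelian subalgebra of $M_d(\mathbb{C})$ generated by the minimal
projections $|j\rangle\langle j|$.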
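The statement that time-averaging removes the off-diagonal matrix elements can be seen numerically. The sketch below takes a discrete-time Cesàro mean of $X_t = U_t^\dagger X U_t$ for a diagonal Hamiltonian; in discrete time one must additionally assume that no energy difference is a multiple of $2\pi$. The spectrum, the matrix `X` and the number of time steps are hypothetical choices.

```python
import cmath

energies = [0.0, 1.0, 2.5]                # hypothetical non-degenerate spectrum, e_0 = 0
X = [[1 + 0j, 2 - 1j, 0.5j],
     [2 + 1j, -1 + 0j, 3 + 0j],
     [-0.5j, 3 + 0j, 4 + 0j]]

def evolve(x, t):
    """Heisenberg evolution: entry (j, k) acquires the phase exp(i t (e_j - e_k))."""
    d = len(x)
    return [[cmath.exp(1j * t * (energies[j] - energies[k])) * x[j][k]
             for k in range(d)] for j in range(d)]

def cesaro_mean(x, T):
    """Discrete-time average (1/T) * sum_{t=0}^{T-1} X_t."""
    d = len(x)
    acc = [[0j] * d for _ in range(d)]
    for t in range(T):
        xt = evolve(x, t)
        for j in range(d):
            for k in range(d):
                acc[j][k] += xt[j][k] / T
    return acc

def pinch(x):
    """Diagonal part of x in the energy eigenbasis."""
    d = len(x)
    return [[x[j][k] if j == k else 0j for k in range(d)] for j in range(d)]

avg = cesaro_mean(X, 20000)
target = pinch(X)
max_dev = max(abs(avg[j][k] - target[j][k]) for j in range(3) for k in range(3))
```

The off-diagonal entries of `avg` decay like $1/T$, so `max_dev` shrinks as the averaging window grows.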
2. All the eigenvectors $|j\rangle$ in the previous example provide $U_t$-invariant expectation
functionals $\omega_j$ on $M_d(\mathbb{C})$. The corresponding irreducible GNS representations
$\pi_{\omega_j}(M_d(\mathbb{C}))$ (see Sect. 5.5.4) act on a Hilbert space isomorphic to $\mathbb{C}^d$ with GNS
cyclic vectors of the form $|\Omega_j\rangle = |j\rangle \otimes |j\rangle$. The dynamics is implemented by
unitary operators of the form

\[ U_{\omega_j}(t) = e^{-itH} \otimes e^{it e_j}\,\mathbb{1}_d , \]

so that they preserve $|\Omega_j\rangle$. It follows that $\eta_{\omega_j}[U] = |\Omega_j\rangle\langle\Omega_j|$, while
$\eta_{\omega_j}[X] = \eta(X) \otimes |j\rangle\langle j|$ for all $X \in M_d(\mathbb{C})$.

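3. Consider the projection $P_\omega$ onto the subspace of vectors $|\psi\rangle \in \mathbb{H}_\omega$ such that
$U_\omega(t)|\psi\rangle = |\psi\rangle$ (the GNS vector $|\Omega_\omega\rangle$ is certainly one of them). Then, since
$\psi, \phi \in \mathbb{H}_\omega$ are arbitrary,

\[ \langle\psi|\eta_\omega[U_\omega]\,P_\omega|\phi\rangle = \eta_t\big[\langle\psi|U_\omega(t)\,P_\omega|\phi\rangle\big] = \langle\psi|P_\omega|\phi\rangle \]
implies $\eta_\omega[U_\omega]\,P_\omega = P_\omega$; analogously, $P_\omega\,\eta_\omega[U_\omega] = P_\omega$. Moreover,
\[ \langle\psi|U_\omega(s)\,\eta_\omega[U_\omega]|\phi\rangle = \eta_t\big[\langle\psi|U_\omega(t+s)|\phi\rangle\big] = \langle\psi|\eta_\omega[U_\omega]|\phi\rangle , \]

whence $\eta_\omega[U_\omega]|\phi\rangle$ is invariant. Thus, $P_\omega\,\eta_\omega[U_\omega]|\phi\rangle = \eta_\omega[U_\omega]|\phi\rangle$ for all
$\phi \in \mathbb{H}_\omega$ and then $\eta_\omega[U_\omega] = P_\omega\,\eta_\omega[U_\omega] = P_\omega$.

Notice that $\eta_\omega[U_\omega]$ belongs to $R_\omega$, for it arises from averaging correlation functions;
furthermore, since it equals $P_\omega$, it does not depend on the specific invariant
mean used.


While the average of the time-evolution $U_\omega(t)$ gives rise to the projector onto
$U_\omega(t)$-invariant vectors (compare Proposition 2.3.4), the average of quasi-local
observables $A \in \mathcal{A}$ transforms them into global constants of the motion, namely
into observables that belong to the strong-closure of $\mathcal{A}$ and that are left invariant by
the dynamics.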


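Proposition 7.4.3 $\eta_\omega[\mathcal{A}] \subseteq \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$.


Proof We check this on the dense subset $\pi_\omega(\mathcal{A})|\Omega_\omega\rangle \subseteq \mathbb{H}_\omega$. Let $X$ belong to $\mathcal{A}$ and
$X'$ to the commutant $\pi_\omega(\mathcal{A})'$, then, using (7.51),

\begin{align*}
\langle\Omega_\omega|\pi_\omega(A)^\dagger\,\eta_\omega[X]\,X'\,\pi_\omega(C)|\Omega_\omega\rangle
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,\pi_\omega(\Theta_t(X))\,X'\,\pi_\omega(C)|\Omega_\omega\rangle\big] \\
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,X'\,\pi_\omega(\Theta_t(X))\,\pi_\omega(C)|\Omega_\omega\rangle\big] \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,X'\,\eta_\omega[X]\,\pi_\omega(C)|\Omega_\omega\rangle .
\end{align*}

Thus, $\eta_\omega[X]$ commutes with $\pi_\omega(\mathcal{A})'$ whence $\eta_\omega[\mathcal{A}] \subseteq \pi_\omega(\mathcal{A})''$. Similarly,

\begin{align*}
\langle\Omega_\omega|\pi_\omega(A)^\dagger\,\eta_\omega[X]\,U_\omega(s)\,\pi_\omega(C)|\Omega_\omega\rangle
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,U_\omega^\dagger(t)\,\pi_\omega(X)\,U_\omega(t+s)\,\pi_\omega(C)|\Omega_\omega\rangle\big] \\
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,U_\omega(s)\,U_\omega^\dagger(t+s)\,\pi_\omega(X)\,U_\omega(t+s)\,\pi_\omega(C)|\Omega_\omega\rangle\big] \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,U_\omega(s)\,\eta_\omega[X]\,\pi_\omega(C)|\Omega_\omega\rangle ,
\end{align*}

for all $X \in \mathcal{A}$ and $s \in G$, whence $\eta_\omega[X] \in U_\omega(G)'$.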


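Example 7.4.9 We have just shown that $X \in \mathcal{A} \Longrightarrow \eta_\omega[X] \in U_\omega(G)'$; also, by
construction (see Example 7.4.8.3), $P_\omega = \eta_\omega[U_\omega] \in U_\omega(G)''$. Then, it follows that
$[\eta_\omega[X] , P_\omega] = 0$, whence

\begin{align*}
\langle\Omega_\omega|\pi_\omega(A)^\dagger\,\eta_\omega[X]\,P_\omega\,\pi_\omega(B)|\Omega_\omega\rangle
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,P_\omega\,\pi_\omega(\Theta_t(X))\,P_\omega\,\pi_\omega(B)|\Omega_\omega\rangle\big] \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,P_\omega\,\pi_\omega(X)\,P_\omega\,\pi_\omega(B)|\Omega_\omega\rangle
\end{align*}

on a dense set. Therefore, for all $X \in \mathcal{A}$, it holds that
\[ \eta_\omega[X]\,P_\omega = P_\omega\,\eta_\omega[X] = P_\omega\,\pi_\omega(X)\,P_\omega \quad \forall X \in \mathcal{A} . \]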


Classical ergodic systems are equivalently identified by the clustering properties
of their two-point correlation functions (Propositions 2.3.2 and 2.3.9), by the spectral
properties of the corresponding Koopman operator (Corollary 2.3.1) and by the
extremality of their invariant states (Proposition 2.3.8). Clustering in the mean as in
(2.65) or (2.75) and extremality are notions that readily extend to the
non-commutative setting.


Definition 7.4.6 A quantum dynamical system $(\mathcal{A}, \Theta, \omega)$ is $\eta$-clustering if

\[ \eta_t\big[\omega(A\,\Theta_t(B))\big] = \omega(A)\,\omega(B) \quad \forall A, B \in \mathcal{A} , \tag{7.58} \]
clustering if
\[ \lim_{t\to\pm\infty} \omega(A\,\Theta_t(B)) = \omega(A)\,\omega(B) \quad \forall A, B \in \mathcal{A} . \tag{7.59} \]

A state $\nu$ on $\mathcal{A}$ is extremal if $\nu = \lambda\nu_1 + (1-\lambda)\nu_2$ with $0 < \lambda < 1$ and $\nu_{1,2}$ states on
$\mathcal{A}$ implies $\nu_{1,2} = \nu$; $\nu$ is extremal $\Theta$-invariant if it cannot be written as a convex
sum of other $\Theta$-invariant states on $\mathcal{A}$.
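Remarks 7.4.5

1. If $\pi_\omega(\mathcal{A})'' \cap U_\omega(G)' = \{\lambda\mathbb{1}\}$, where $\{\lambda\mathbb{1}\}$ denotes the trivial algebra consisting
only of multiples of the identity, then the global constants of the motion are trivial.
It follows that $(\mathcal{A}, \Theta, \omega)$ is $\eta$-clustering. In fact, in such a case $\eta_\omega[B] = \omega(B)\,\mathbb{1}$
for all $B \in \mathcal{A}$, whence

\[ \eta_t\big[\omega(A\,\Theta_t(B))\big] = \langle\Omega_\omega|\pi_\omega(A)\,\eta_\omega[B]|\Omega_\omega\rangle = \omega(A)\,\omega(B) . \]

2. Suppose the invariant state $\omega$ in $(\mathcal{A}, \Theta, \omega)$ is not extremal invariant; then, there
exists a state $\nu$ on $\mathcal{A}$ such that $\lambda\nu \le \omega$, for some $0 < \lambda < 1$, and $\nu \circ \Theta_t = \nu$.
Thus, from Remark 5.3.2.3, $\lambda\nu(X) = \langle\Omega_\omega|T'\,\pi_\omega(X)|\Omega_\omega\rangle$ for all $X \in \mathcal{A}$, with
$0 \le T' \in \pi_\omega(\mathcal{A})'$. It turns out that $T' \in U_\omega(G)'$, too; indeed, $\Theta$-invariance yields

\begin{align*}
\langle\Omega_\omega|\pi_\omega(A)^\dagger\,T'\,U_\omega(t)\,\pi_\omega(B)|\Omega_\omega\rangle
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,T'\,\pi_\omega(\Theta_{-t}(B))|\Omega_\omega\rangle \\
 &= \lambda\nu(A^\dagger\,\Theta_{-t}(B)) = \lambda\nu(\Theta_t(A)^\dagger\,B) \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,U_\omega(t)\,T'\,\pi_\omega(B)|\Omega_\omega\rangle ,
\end{align*}

on a dense set. Therefore, $T' \in R_\omega'$.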


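Consider now the following list (we shall refer to it as ergodic list in the following)
of statements [129] concerning clustering, extremal $\Theta$-invariance and the spectral
properties of the dynamics of quantum dynamical systems.

(7.60) $\omega$ is extremal invariant
(7.61) $R_\omega' = \{\lambda\mathbb{1}\}$
(7.62) $R_\omega \cap R_\omega' = \{\lambda\mathbb{1}\}$
(7.63) $(\mathcal{A}, \Theta, \omega)$ is $\eta$-clustering
(7.64) $P_\omega$ is a one-dimensional projector
(7.65) $\pi_\omega(\mathcal{A})'' \cap R_\omega' = \{\lambda\mathbb{1}\}$
(7.66) $\eta_\omega[X] = \omega(X)\,\mathbb{1}$ $\forall X \in \mathcal{A}$
(7.67) $\omega$ is the only normal invariant state on $\pi_\omega(\mathcal{A})''$
(7.68) $|\Omega_\omega\rangle$ is the only invariant vector state in $\pi_\omega(\mathcal{A})|\Omega_\omega\rangle$.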


From Remark 7.4.5.2 it follows that (7.61) $\Longrightarrow$ (7.60); also, if $R_\omega'$ is not trivial, then,
as in Remark 5.3.2.4, $\omega$ can be decomposed into a convex combination of $\Theta$-invariant
states. Therefore, (7.60) $\Longleftrightarrow$ (7.61). If $\omega$ is not extremal invariant, the corresponding
operator $\mathbb{1} \neq T' \in R_\omega'$ provides an invariant vector state $U_\omega(t)\,T'|\Omega_\omega\rangle = T'|\Omega_\omega\rangle$,
thus $|\Omega_\omega\rangle$ is not the only invariant vector state and the projector $P_\omega$ onto the invariant
subspace of $\mathbb{H}_\omega$ cannot be one-dimensional, whence (7.68) $\Longrightarrow$ (7.60) and (7.64)
$\Longrightarrow$ (7.60).




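Furthermore, using Example 7.4.8, one gets

\[ \eta_t\big[\omega(A\,\Theta_t(B))\big] = \langle\Omega_\omega|\pi_\omega(A)\,\eta_\omega[U_\omega]\,\pi_\omega(B)|\Omega_\omega\rangle = \langle\Omega_\omega|\pi_\omega(A)\,P_\omega\,\pi_\omega(B)|\Omega_\omega\rangle ; \]

together with $P_\omega|\Omega_\omega\rangle = |\Omega_\omega\rangle$ and
\[ \omega(A)\,\omega(B) = \langle\Omega_\omega|\pi_\omega(A)|\Omega_\omega\rangle\,\langle\Omega_\omega|\pi_\omega(B)|\Omega_\omega\rangle , \]

this yields (7.64) $\Longleftrightarrow$ (7.63).

From Proposition 5.5.2 it follows that the normal invariant states $\rho$ on $\pi_\omega(\mathcal{A})''$
must correspond to density matrices $\rho \in B_1^+(\mathbb{H}_\omega)$ that commute with $U_\omega(G)$; thus,
$\rho(\eta_\omega[X]) = \rho(\pi_\omega(X))$ for all $X \in \mathcal{A}$ and, if $\eta_\omega[X] = \omega(X)\,\mathbb{1}$, then $\rho(\eta_\omega[X]) =
\rho(\pi_\omega(X)) = \omega(X)$, whence (7.66) $\Longrightarrow$ (7.67). Also, from Remark 7.4.5.1, (7.66)
$\Longrightarrow$ (7.63).

Other implications are (7.61) $\Longrightarrow$ (7.62) $\Longrightarrow$ (7.65) and (7.67) $\Longrightarrow$ (7.68). Summarizing,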


Proposition 7.4.4 With reference to the previous list of ergodic properties, the following
implications hold

(7.64) $\Longleftrightarrow$ (7.63) $\Longleftarrow$ (7.66) $\Longrightarrow$ (7.67) $\Longrightarrow$ (7.68) $\Longrightarrow$ (7.60)
(7.64) $\Longrightarrow$ (7.60)
(7.65) $\Longleftarrow$ (7.62) $\Longleftarrow$ (7.61) $\Longleftrightarrow$ (7.60)


While in a commutative setting the properties (7.60)-(7.68) are equivalent and
each of them identifies an ergodic system, it is not so in the quantum realm, unless
$(\mathcal{A}, \Theta, \omega)$ is asymptotic Abelian [129,353].


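Definition 7.4.7 (Asymptotic Abelianess) Suppose $(\mathcal{A}, \Theta, \omega)$ enjoys one of the following
properties

\[ \eta_t\big[\omega(A^\dagger\,[B , \Theta_t(C)]\,D)\big] = 0 \quad \forall A, B, C, D \in \mathcal{A} \tag{7.69} \]
\[ \lim_{t\to\pm\infty} \omega(A^\dagger\,[B , \Theta_t(C)]\,D) = 0 \quad \forall A, B, C, D \in \mathcal{A} \tag{7.70} \]
\[ \lim_{t\to\pm\infty} \omega(A^\dagger\,[B , \Theta_t(C)]^\dagger\,[B , \Theta_t(C)]\,A) = 0 \quad \forall A, B, C \in \mathcal{A} \tag{7.71} \]
\[ \lim_{t\to\pm\infty} \|[B , \Theta_t(C)]\| = 0 \quad \forall B, C \in \mathcal{A} . \tag{7.72} \]

In the first case, $(\mathcal{A}, \Theta, \omega)$ is called $\eta$-Abelian, in the second one weakly asymptotic
Abelian, in the third case strongly asymptotic Abelian and in the last one norm
asymptotic Abelian.


Physically speaking, asymptotic Abelianess corresponds to the fact that the non-commutativity
of any pair of local observables is a property which dies out asymptotically
under the action of the dynamics on one of them.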
Example 7.4.10 Quantum spin-chains (see Example 7.4.1) are the simplest instance
of norm-asymptotic Abelianess with respect to lattice-translations on the quasi-local
C* algebra $\mathcal{A}$ [328]. In the case of spins 1/2 at each site, lattice-translations correspond
to the shift-automorphism $\Theta_\sigma : \mathcal{A} \to \mathcal{A}$ defined by


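Given $A, B \in \mathcal{A}$, for any $\varepsilon > 0$ we can find strictly local $A_\varepsilon, B_\varepsilon \in \mathcal{A}_{[-k,k]}$ such that
$\|A - A_\varepsilon\| \le \varepsilon$ and $\|B - B_\varepsilon\| \le \varepsilon$. If $|n| > 2k$, $\Theta_\sigma^n[B_\varepsilon] \in \mathcal{A}_{[n-k,n+k]}$ commutes
with $A_\varepsilon$, $[A_\varepsilon , \Theta_\sigma^n[B_\varepsilon]] = 0$, whence

\[ \big\|[A , \Theta_\sigma^n[B]]\big\| \le \big\|[A - A_\varepsilon , \Theta_\sigma^n[B]]\big\| + \big\|[A_\varepsilon , \Theta_\sigma^n[B - B_\varepsilon]]\big\| \le 2\varepsilon\,\|B\| + 2\varepsilon\,(\varepsilon + \|A\|) . \]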

The mean magnetization $m$ in (7.3) is not a quasi-local observable, for the spatial
average collects contributions from all local regions; nevertheless, it exists in the
GNS representations $\pi_{\omega_{\uparrow,\downarrow}}(\mathcal{A})''$ corresponding to the translation-invariant states $\omega_{\uparrow,\downarrow}$
in (7.1) and (7.2). Furthermore, local non-commutativity is suppressed by dividing
by a larger and larger number of sites, with the result that the mean magnetization
commutes with all local observables. By its very construction it also commutes with
the GNS unitary operator $U_\omega$ which implements the space-translations on the GNS
Hilbert space. Therefore, $m \in R_{\omega_{\uparrow,\downarrow}}' = \pi_{\omega_{\uparrow,\downarrow}}(\mathcal{A})' \cap U_\omega(\mathbb{Z})'$. Since $\pi_{\omega_{\uparrow,\downarrow}}(\mathcal{A})' = \{\lambda\mathbb{1}\}$,
it thus follows that $m$ is a scalar multiple of the identity in the two representations.


The fact that the mean magnetization commutes with all local observables holds,
more in general, as a consequence of $\eta$-Abelianess; namely, $\eta_\omega[\mathcal{A}] \subseteq R_\omega'$ follows
from observing that (7.69) yields

\[ \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,[\pi_\omega(X) , \pi_\omega(\Theta_t(Y))]\,\pi_\omega(B)|\Omega_\omega\rangle\big] = \langle\Omega_\omega|\pi_\omega(A)^\dagger\,[\pi_\omega(X) , \eta_\omega[Y]]\,\pi_\omega(B)|\Omega_\omega\rangle = 0 . \]


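Remark 7.4.6 Proposition 7.4.3 states that $\eta_\omega[\mathcal{A}] \subseteq \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$; if
$(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian then also $\eta_\omega[\mathcal{A}] \subseteq \pi_\omega(\mathcal{A})'$, whence $\eta_\omega[\mathcal{A}] \subseteq R_\omega \cap R_\omega'$.
Since the latter is an Abelian von Neumann algebra, actually the center of $R_\omega$ (see
Definition 5.3.4), it turns out that $[\eta_\omega[X] , \eta_\omega[Y]] = 0$ for all $X, Y \in \mathcal{A}$. Therefore,
using Example 7.4.9, one proves that on a dense set,
\begin{align*}
\langle\Omega_\omega|\pi_\omega(A)^\dagger\,[P_\omega\pi_\omega(X)P_\omega , P_\omega\pi_\omega(Y)P_\omega]\,\pi_\omega(B)|\Omega_\omega\rangle
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,[P_\omega\pi_\omega(X)P_\omega , P_\omega\pi_\omega(\Theta_t(Y))P_\omega]\,\pi_\omega(B)|\Omega_\omega\rangle\big] \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,[P_\omega\pi_\omega(X)P_\omega , P_\omega\,\eta_\omega[Y]\,P_\omega]\,\pi_\omega(B)|\Omega_\omega\rangle \\
 &= \eta_t\big[\langle\Omega_\omega|\pi_\omega(A)^\dagger\,[P_\omega\pi_\omega(\Theta_t(X))P_\omega , P_\omega\,\eta_\omega[Y]\,P_\omega]\,\pi_\omega(B)|\Omega_\omega\rangle\big] \\
 &= \langle\Omega_\omega|\pi_\omega(A)^\dagger\,P_\omega\,[\eta_\omega[X] , \eta_\omega[Y]]\,P_\omega\,\pi_\omega(B)|\Omega_\omega\rangle = 0 ,
\end{align*}

for all $X, Y \in \mathcal{A}$, where $U_\omega(t)\,P_\omega = P_\omega$ has been used to insert the time-evolution.
Therefore, $\eta$-Abelianess implies Abelianess of $P_\omega\,\pi_\omega(\mathcal{A})\,P_\omega$ as a
set (in general it is not an algebra).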


If $(\mathcal{A}, \Theta, \omega)$ corresponds to a classical dynamical system, then (7.70), (7.71) and
(7.72) are equivalent statements implying (7.69). In a genuinely quantum setting,
however, (7.71), (7.70) and (7.72) take into account the differences between convergence
in the weak, strong and norm topologies, whereby, in general, (7.72) $\Longrightarrow$
(7.71) $\Longrightarrow$ (7.70) $\Longrightarrow$ (7.69). Interestingly, the weakest sort of Abelianess is sufficient
to make properties (7.60)-(7.68) equivalent to each other. Indeed, the consequences
of $\eta$-Abelianess are as follows.


Proposition 7.4.5 If $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian, then $R_\omega' = \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$.


Corollary 7.4.1 If $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian, the properties (7.60)-(7.68) in the
ergodic list are equivalent.


Proof Because of $\eta$-Abelianess, applying the previous proposition one gets that
(7.65) $\Longrightarrow$ (7.66), whence the claimed equivalence follows from
Proposition 7.4.4.


Then, the following definition makes sense.


Definition 7.4.8 An $\eta$-Abelian quantum dynamical system $(\mathcal{A}, \Theta, \omega)$ is ergodic if
$\omega$ is extremal invariant.


Remark 7.4.7 In the case of Example 7.4.8.2, the eigenvectors of the Hamiltonian
are extremal as functionals on $M_d(\mathbb{C})$ and thus ergodic according to the definition
above. However, finite-dimensional systems cannot be asymptotic Abelian; indeed,
while condition (7.64) holds, condition (7.66) does not.
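Proof of Proposition 7.4.5 The proof is based on 3 steps.
• Step 1: If $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian then $P_\omega\,R_\omega\,P_\omega$ is Abelian.

Indeed, from Remark 7.4.6 we know that $P_\omega\,\pi_\omega(\mathcal{A})\,P_\omega$ is Abelian, thus, by continuity,
also $P_\omega\,\pi_\omega(\mathcal{A})''\,P_\omega$. On the other hand, since the elements of $R_\omega$ are strong
limits of operators of the form (7.52), it turns out that $P_\omega\,\pi_\omega(\mathcal{A})''\,P_\omega = P_\omega\,R_\omega\,P_\omega$,
whence the latter is Abelian.

• Step 2: If $P_\omega\,R_\omega\,P_\omega$ is Abelian, then $R_\omega'$ is Abelian ($R_\omega' \subseteq R_\omega'' = R_\omega$).
Since $P_\omega \in R_\omega$, we can use Example 5.3.2.6 to deduce that

\[ P_\omega\,R_\omega\,P_\omega \subseteq (P_\omega\,R_\omega\,P_\omega)' = P_\omega\,R_\omega'\,P_\omega . \]

Further, since $|\Omega_\omega\rangle$ is cyclic for $P_\omega\,R_\omega\,P_\omega$ with respect to the Hilbert space $P_\omega\,\mathbb{H}_\omega$,
using Example 7.4.4.1 we conclude that $P_\omega\,R_\omega\,P_\omega = P_\omega\,R_\omega'\,P_\omega$, whence the latter
algebra is Abelian. In order to prove the statement, we show that $R_\omega'$ and $P_\omega\,R_\omega'\,P_\omega$
are isomorphic; for any $X' \in R_\omega'$ set $\Lambda(X') = P_\omega\,X'\,P_\omega$. This map is obviously linear
and surjective; further, since $P_\omega \in R_\omega$, $\Lambda(X'Y') = \Lambda(X')\,\Lambda(Y')$ for all $X', Y' \in R_\omega'$.
Finally, suppose $\Lambda(Z') = 0$ for some $Z' \in R_\omega'$, then, since $P_\omega|\Omega_\omega\rangle = |\Omega_\omega\rangle$,

\[ Z'\,\pi_\omega(X)|\Omega_\omega\rangle = \pi_\omega(X)\,Z'\,P_\omega|\Omega_\omega\rangle = \pi_\omega(X)\,P_\omega\,Z'\,P_\omega|\Omega_\omega\rangle = 0 \]
on a dense set. Thus, $Z' = 0$ and $\Lambda$ is also injective.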
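• Step 3: If $X \in \mathcal{A}$ and $\pi_\omega(X) \in \pi_\omega(\mathcal{A}) \cap U_\omega(G)'$ then $\eta_\omega[X] = \pi_\omega(X) \in \pi_\omega(\mathcal{A})'$.
This fact can be extended by continuity to the constants of the motion in the strong
closure $\pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$ so that

\[ \pi_\omega(\mathcal{A})'' \cap U_\omega(G)' \subseteq \pi_\omega(\mathcal{A})' \Longrightarrow \pi_\omega(\mathcal{A})'' \cap U_\omega(G)' \subseteq R_\omega' . \tag{7.73} \]

If $X \in \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$, it commutes with the projector onto the invariant vectors
$P_\omega \in R_\omega$, whence $X\,P_\omega = P_\omega\,X\,P_\omega$ implies

\[ \big(\pi_\omega(\mathcal{A})'' \cap U_\omega(G)'\big)\,P_\omega \subseteq P_\omega\,\pi_\omega(\mathcal{A})''\,P_\omega . \]
On the other hand, Example 7.4.9 and Proposition 7.4.3 yield
\[ P_\omega\,\pi_\omega(\mathcal{A})\,P_\omega = \eta_\omega[\mathcal{A}]\,P_\omega \subseteq \big(\pi_\omega(\mathcal{A})'' \cap U_\omega(G)'\big)\,P_\omega , \]
whence by continuity
\[ P_\omega\,\pi_\omega(\mathcal{A})''\,P_\omega = \big(\pi_\omega(\mathcal{A})'' \cap U_\omega(G)'\big)\,P_\omega . \tag{7.74} \]
Finally, from the first two steps, (7.73) and (7.74), we derive

\[ R_\omega'\,P_\omega = P_\omega\,R_\omega'\,P_\omega = P_\omega\,R_\omega\,P_\omega = P_\omega\,\pi_\omega(\mathcal{A})''\,P_\omega = \big(\pi_\omega(\mathcal{A})'' \cap U_\omega(G)'\big)\,P_\omega . \]
Consequently, given $X' \in R_\omega'$ there exists $Y' \in \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$ such that
$(X' - Y')|\Omega_\omega\rangle = 0$. As $|\Omega_\omega\rangle$ is cyclic for $R_\omega$, it is separating for $R_\omega'$, thence $X' = Y'$ so
that $R_\omega' \subseteq \pi_\omega(\mathcal{A})''$ or, equivalently, $R_\omega' \subseteq \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$ which, together with
(7.73), completes the proof.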


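Remark 7.4.8 In case of $\eta$-Abelianess, from $\mathcal{R}'_\omega = \pi_\omega(\mathcal{A})'' \cap U_\omega(G)'$ it follows
that $\mathcal{R}'_\omega$ is contained in the center $\mathcal{Z}_\omega = \pi_\omega(\mathcal{A})'' \cap \pi_\omega(\mathcal{A})'$ of $\pi_\omega(\mathcal{A})''$ and is thus
Abelian. Actually, $\mathcal{R}'_\omega$ coincides with the commutative algebra of $\Theta$-invariant classical
observables of the quantum system $(\mathcal{A}, \Theta, \omega)$. Therefore, if $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian
and $\omega$ is a factor state (see Remark 5.3.2.2), then $\mathcal{R}'_\omega = \{\lambda\mathbb{1}\}$ and $\omega$ is
extremal invariant and thus ergodic, according to Definition 7.4.8.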

If $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian, but the state $\omega$ is not extremal invariant, that is not
ergodic with respect to $\Theta$, then $\mathcal{R}'_\omega$ cannot be trivial. However, it is Abelian and thus
generated by a unique set of minimal projections $P_j$ (see Sect. 5.3.2). Each of these
projections is such that $(U_\omega^t)^\dagger\,P_j\,U_\omega^t = P_j$ and thus provides a $\Theta$-invariant state $\omega_j$
on $\mathcal{A}$:

$$\omega_j(X) := \frac{\langle\Omega_\omega|\,P_j\,\pi_\omega(X)\,|\Omega_\omega\rangle}{\omega(P_j)}\ .$$

Since the $P_j$ are minimal projections in $\mathcal{R}'_\omega$, they cannot be further decomposed in
$\mathcal{R}'_\omega$, whence the $\omega_j$ are extremal invariant and yield a decomposition $\omega = \sum_j \omega(P_j)\,\omega_j$
of $\omega$ into its ergodic components.

Example 7.4.11 [262] Let $\omega_\pm$ be states of a one-dimensional spin-chain of the
form (7.4), where (the upper indices label the lattice sites, the lower ones the Pauli
matrices)


e (Toi) =H (22 os) T's 
l=1 l=1 
S (i a) = IT (oS = (= Lit 03 ax) _ ] [Cat Sis . 
l=1 


£=1 


Unlike the states (7.1) and (7.2), which are characterized by infinitely many spins
pointing up, respectively down, along the $z$ axis, these states are anti-ferromagnetic,
with alternating spins up (at even sites for $\omega_+$, at odd sites for $\omega_-$) and spins down. These are
pure states and give rise to factor GNS representations $\pi_\pm(\mathcal{A})''$: in fact, as for the
states (7.1) and (7.2), one can show that the commutants $\pi_\pm(\mathcal{A})'$ are trivial. Also, by
considering the shift-automorphism of Example 7.4.10, it turns out that $\omega_\pm \circ \Theta_\sigma = \omega_\mp$,
so that the state

$$\omega := \frac{\omega_+ + \omega_-}{2} \tag{7.75}$$

is translation-invariant and can be decomposed in terms of pure states:

$$\omega_\pm(A) = \frac{\langle\Omega_\omega|\,Q_\pm\,\pi_\omega(A)\,|\Omega_\omega\rangle}{\langle\Omega_\omega|\,Q_\pm\,|\Omega_\omega\rangle}$$
with $Q_\pm \in \pi_\omega(\mathcal{A})'$. If $\omega$ could be decomposed into $\Theta_\sigma$-invariant states $\omega_i$, these
would correspond to $0 \le Q_i \in \mathcal{R}'_\omega = \pi_\omega(\mathcal{A})' \cap U_\omega(G)'$ that provide decompositions
of the pure states $\omega_\pm$ as well,

$$\omega_\pm(A) = \frac{\langle\Omega_\omega|\,Q_\pm\,Q_i\,\pi_\omega(A)\,|\Omega_\omega\rangle}{\langle\Omega_\omega|\,Q_\pm\,|\Omega_\omega\rangle} + \frac{\langle\Omega_\omega|\,Q_\pm\,(\mathbb{1} - Q_i)\,\pi_\omega(A)\,|\Omega_\omega\rangle}{\langle\Omega_\omega|\,Q_\pm\,|\Omega_\omega\rangle}\ ,$$

which is impossible. Therefore $\omega$ is extremal invariant, but not a factor state. As for
the spin system considered at the beginning of this chapter, the following even and
odd magnetizations $m^{e,o} = (m_1^{e,o}, m_2^{e,o}, m_3^{e,o})$ (see (7.3)) exist as strong operator
limits in the GNS representations $\pi_\pm$ and $\pi_\omega$:


N N 


‘ : H 2i o. : H 2i+1 
me ST 2 BE MT a SIN F] 2 i 


They commute with all local observables and thus belong to the trivial centers $\mathcal{Z}_\pm$
and the non-trivial one $\mathcal{Z}_\omega$. In the first two representations they are multiples of the identity,
$m^e_\pm = (0, 0, \pm\mu)$ and $m^o_\pm = (0, 0, \mp\mu)$, while in the representation $\pi_\omega$, which can be
conveniently split as $\pi_\omega(\mathcal{A}) = \pi_+(\mathcal{A}) \oplus \pi_-(\mathcal{A})$,


rm = n(o a) Da a) 


Furthermore, while enjoying clustering in the mean, the representation $\pi_\omega$, which is
not a factor, is not clustering; indeed,


m E qk —1)l1+k 
nh CD H 2a while woh = D 


wobo} 
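The gap between clustering in the mean and clustering can be checked on the third spin component alone. The following is a minimal sketch, not taken from the text: it assumes that $\omega_+$ is the product state with $\sigma_3$-eigenvalue $+1$ at even sites and $-1$ at odd sites, that $\omega_-$ is its translate by one site, and that $\omega = (\omega_+ + \omega_-)/2$. All function names are illustrative.

```python
def sz_plus(k):
    """Expectation of sigma_3 at site k in omega_+ (assumed: up at even sites)."""
    return 1 if k % 2 == 0 else -1


def sz_minus(k):
    """Expectation of sigma_3 at site k in omega_- (assumed: up at odd sites)."""
    return -sz_plus(k)


def omega_one_point(k):
    """omega(sigma_3^k) for the mixture omega = (omega_+ + omega_-)/2."""
    return 0.5 * (sz_plus(k) + sz_minus(k))


def omega_two_point(k):
    """omega(sigma_3^0 sigma_3^k); product states factorize site by site."""
    return 0.5 * (sz_plus(0) * sz_plus(k) + sz_minus(0) * sz_minus(k))


def cesaro_mean(n_sites):
    """Average of the two-point function over the first n_sites translations."""
    return sum(omega_two_point(k) for k in range(n_sites)) / n_sites
```

Under these assumptions the two-point function equals $(-1)^k$ and never approaches the product of one-point functions, which is zero, whereas its Cesàro mean tends to zero.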


From the previous discussion, it turns out that if $(\mathcal{A}, \Theta, \omega)$ is $\eta$-Abelian and $\omega$ is
extremal invariant (property (7.60) in the ergodic list), then two-point correlation
functions factorize in the mean (property (7.63) in the ergodic list). Concerning
the extension of the classical property of mixing (see Proposition 2.3.3, (2.66) and
(2.76)) to quantum dynamical systems, due to non-commutativity, one distinguishes
between various ways of clustering besides (7.59).


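Definition 7.4.9 A quantum dynamical system $(\mathcal{A}, \Theta, \omega)$ is weakly mixing if [58]

$$\lim_{t\to\pm\infty} \omega\big(A\,\Theta_t(B)\,C\big) = \omega(AC)\,\omega(B) \qquad \forall\, A, B, C \in \mathcal{A}\ ; \tag{7.76}$$

strongly mixing [58] (or hyperclustering in [257]) if

$$\lim_{t\to\pm\infty} \omega\big(A\,\Theta_t(B)\,C\,\Theta_t(D)\,E\big) = \omega(ACE)\,\omega(BD) \qquad \forall\, A, B, C, D, E \in \mathcal{A}\ . \tag{7.77}$$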
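Clearly, strong mixing implies weak mixing and weak mixing implies
$\eta$-Abelianess; it also implies weak asymptotic Abelianess, whereas strong mixing
implies strong asymptotic Abelianess. The proof of the latter statement (the proof
of the former one is similar) comes from (7.77) applied to

$$\omega\big(A^\dagger\,[\Theta_t(B), C]^\dagger\,[\Theta_t(B), C]\,A\big) =$$
$$= \omega\big((CA)^\dagger\,\Theta_t(B^\dagger B)\,CA\big) - \omega\big((CA)^\dagger\,\Theta_t(B)^\dagger\,C\,\Theta_t(B)\,A\big)$$
$$- \omega\big(A^\dagger\,\Theta_t(B)^\dagger\,C^\dagger\,\Theta_t(B)\,CA\big) + \omega\big(A^\dagger\,\Theta_t(B)^\dagger\,C^\dagger C\,\Theta_t(B)\,A\big)\ ,$$

which yields

$$\lim_{t\to\pm\infty} \omega\big(A^\dagger\,[\Theta_t(B), C]^\dagger\,[\Theta_t(B), C]\,A\big) =$$
$$= \omega\big((CA)^\dagger CA\big)\,\omega(B^\dagger B) - \omega\big((CA)^\dagger CA\big)\,\omega(B^\dagger B)$$
$$- \omega\big(A^\dagger C^\dagger CA\big)\,\omega(B^\dagger B) + \omega\big(A^\dagger C^\dagger CA\big)\,\omega(B^\dagger B) = 0\ .$$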


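Also, weak mixing and strong asymptotic Abelianess together imply strong mixing;
this can be seen by rewriting

$$\omega\big(A^\dagger\,[\Theta_t(B), C]^\dagger\,[\Theta_t(B), C]\,A\big) =$$
$$= \omega\big((CA)^\dagger\,\Theta_t(B^\dagger B)\,CA\big) - \omega\big((CA)^\dagger\,\Theta_t(B^\dagger B)\,CA\big)$$
$$- \omega\big(A^\dagger\,\Theta_t(B^\dagger B)\,C^\dagger C\,A\big) + \omega\big(A^\dagger\,\Theta_t(B^\dagger B)\,C^\dagger C\,A\big)$$
$$- \omega\big((CA)^\dagger\,\Theta_t(B)^\dagger\,[C, \Theta_t(B)]\,A\big) - \omega\big(A^\dagger\,\Theta_t(B)^\dagger\,[C^\dagger, \Theta_t(B)]\,CA\big)$$
$$+ \omega\big(A^\dagger\,\Theta_t(B)^\dagger\,[C^\dagger C, \Theta_t(B)]\,A\big)\ .$$

Now, weak mixing means that the first four terms factorize in the same way and
thus cancel each other, asymptotically in time; on the other hand, each of the terms
with the commutators vanishes asymptotically if the system is strongly asymptotic
Abelian. This follows from the Cauchy-Schwarz inequality (5.51), which gives
upper bounds of the form

$$\Big|\omega\big(A^\dagger\,\Theta_t(B)^\dagger\,[C^\dagger, \Theta_t(B)]\,CA\big)\Big|^2 \le$$
$$\le \omega\big(A^\dagger\,\Theta_t(B^\dagger B)\,A\big)\ \omega\big((CA)^\dagger\,[C^\dagger, \Theta_t(B)]^\dagger\,[C^\dagger, \Theta_t(B)]\,CA\big)\ .$$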


Proposition 7.4.6 Given a quantum dynamical system $(\mathcal{A}, \Theta, \omega)$, we have the
following implications:

1. strong mixing (7.77) implies weak mixing (7.76);
2. strong mixing (7.77) implies strong asymptotic Abelianess (7.71);
3. weak mixing (7.76) implies weak asymptotic Abelianess (7.70);
4. weak mixing (7.76) and strong asymptotic Abelianess (7.71) imply strong mixing
(7.77).
Remark 7.4.9 If the state $\omega$ is faithful then weak mixing is equivalent to the
factorization of two-point correlation functions (see (7.59)). Of course, if $(\mathcal{A}, \Theta, \omega)$ is
weakly mixing, then $\omega(X\,\Theta_t(Y))$ asymptotically splits into $\omega(X)\,\omega(Y)$. Vice versa,
using the KMS conditions (7.27), if (7.59) holds, then

$$\lim_{t\to\pm\infty} \omega\big(A\,\Theta_t(B)\,C\big) = \lim_{t\to\pm\infty} \omega\big(\sigma_\omega(i)(C)\,A\,\Theta_t(B)\big) = \omega\big(\sigma_\omega(i)(C)\,A\big)\,\omega(B) = \omega(AC)\,\omega(B)\ .$$

The same conclusion, that (7.59) implies weak mixing, follows if one knows $(\mathcal{A}, \Theta, \omega)$
to be weakly asymptotic Abelian; one uses

$$\omega\big(A\,\Theta_t(B)\,C\big) = \omega\big(AC\,\Theta_t(B)\big) + \omega\big(A\,[\Theta_t(B), C]\big)\ .$$


The following proposition establishes a link, similar to the classical one, between 
mixing and the spectral properties of the time-evolution group U,,(G) in the GNS 
construction. 


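Proposition 7.4.7 If $(\mathcal{A}, \Theta, \omega)$ is weakly mixing, the following equivalent properties
hold:

$$\text{w-}\!\lim_{t\to\pm\infty} \pi_\omega\big(\Theta_t(X)\big) = \omega(X)\,\mathbb{1} \qquad \forall\, X \in \mathcal{A} \tag{7.78}$$
$$\text{w-}\!\lim_{t\to\pm\infty} U_\omega(t) = |\Omega_\omega\rangle\langle\Omega_\omega|\ . \tag{7.79}$$

Proof Weak mixing asserts that $\lim_{t\to\pm\infty} \pi_\omega(\Theta_t(X)) = \omega(X)\,\mathbb{1}$ weakly on a dense set, whence
(7.78). Furthermore, from

$$\omega\big(A^\dagger\,\Theta_t(B)\big) = \langle\Omega_\omega|\,\pi_\omega(A)^\dagger\,U_\omega^\dagger(t)\,\pi_\omega(B)\,|\Omega_\omega\rangle \quad\text{and}$$
$$\omega(A^\dagger)\,\omega(B) = \langle\Omega_\omega|\,\pi_\omega(A)^\dagger\,|\Omega_\omega\rangle\langle\Omega_\omega|\,\pi_\omega(B)\,|\Omega_\omega\rangle\ ,$$

for all $A, B \in \mathcal{A}$, it follows that (7.78) and (7.79) are equivalent.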


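Remarks 7.4.10

1. If $(\mathcal{A}, \Theta, \omega)$ is norm-asymptotic Abelian and $\omega$ is a factor state, then $\omega$ is clustering
as in (7.59) [80]. Indeed, $\mathcal{Z}_\omega = \{\lambda\mathbb{1}\}$ says that the $C^*$ algebra $\mathcal{B}$ generated by
$\pi_\omega(\mathcal{A})$ and $\pi_\omega(\mathcal{A})'$ has trivial commutant $\mathcal{B}' = \{\lambda\mathbb{1}\}$, namely $\mathcal{B}'' = \mathbb{B}(\mathcal{H}_\omega)$. Let
$X \in \mathcal{A}$ and consider the vector

$$\mathcal{H}_\omega \ni |\Psi_X\rangle = \big(\pi_\omega(X) - \omega(X)\,\mathbb{1}\big)|\Omega_\omega\rangle\ .$$

It is orthogonal to $|\Omega_\omega\rangle$: thus, there exists $\mathbb{B}(\mathcal{H}_\omega) \ni T = T^\dagger$ such that

$$T|\Psi_X\rangle = 0\ ,\qquad T|\Omega_\omega\rangle = |\Omega_\omega\rangle\ .$$
Actually, $T$ can be chosen in $\mathcal{B}$ [80]: the operators
$$C_1 := T\,\big(\pi_\omega(X) - \omega(X)\,\mathbb{1}\big)\ ,\qquad C_2 := (\mathbb{1} - T)\,\big(\pi_\omega(X) - \omega(X)\,\mathbb{1}\big)$$

are in $\mathcal{B}$. Further, $C_1|\Omega_\omega\rangle = C_2^\dagger|\Omega_\omega\rangle = 0$ and $\pi_\omega(X) = C_1 + C_2 + \omega(X)\,\mathbb{1}$;
consequently,

$$\omega\big(X\,\Theta_t[Y]\big) - \omega(X)\,\omega(Y) = \langle\Omega_\omega|\,C_1\,\pi_\omega(\Theta_t[Y])\,|\Omega_\omega\rangle = \langle\Omega_\omega|\,\big[C_1, \pi_\omega(\Theta_t[Y])\big]\,|\Omega_\omega\rangle\ .$$

Now, for any $\varepsilon > 0$ one can approximate $C_1 \in \mathcal{B}$ by a finite sum of products of elements in $\pi_\omega(\mathcal{A}) \cup \pi_\omega(\mathcal{A})'$:

$$\Big\| C_1 - \sum_{j=1}^{n} \pi_\omega(X_j)\,Y'_j \Big\| \le \frac{\varepsilon}{\|Y\|}\ ,\qquad Y'_j \in \pi_\omega(\mathcal{A})'\ ,$$

so that

$$\big|\omega\big(X\,\Theta_t[Y]\big) - \omega(X)\,\omega(Y)\big| \le \sum_{j=1}^{n} \Big|\langle\Omega_\omega|\,\big[\pi_\omega(X_j), \pi_\omega(\Theta_t[Y])\big]\,Y'_j\,|\Omega_\omega\rangle\Big| + \varepsilon$$
$$\le \sum_{j=1}^{n} \|Y'_j\|\ \big\|\big[X_j, \Theta_t[Y]\big]\big\| + \varepsilon\ .$$

Thus, since $(\mathcal{A}, \Theta, \omega)$ is assumed to be norm-asymptotic Abelian, it turns
out to be clustering whence, according to Remark 7.4.9, also weakly mixing.

2. If $(\mathcal{A}, \Theta, \omega)$ is norm-asymptotic Abelian and $\omega$ is an extremal KMS state, then it
is a factor state (see Remark 7.4.3.4) and the system is weakly mixing.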


Example 7.4.12 The infinite dimensional systems of Example 7.4.6 provide an
interesting framework to apply the previous abstract considerations [60].

Since the $\Theta_A$-invariant state is tracial, as explained in Remark 7.4.9, correlation
functions such as those appearing in (7.76) can be reduced to two-point correlation
functions as in (7.59). Then, (7.39) yields

$$\omega\big(W_\theta(f)\,\Theta_A^t[W_\theta(g)]\big) = \langle f^* |\,U_A^t\,| g\rangle\ .$$

Therefore, if the underlying classical system is mixing, that is if the Koopman operator
$U_A$ has absolutely continuous spectrum on the subspace orthogonal to the constant
function (namely to the GNS cyclic vector $|\Omega_\omega\rangle$), then, for all $f, g \in L^2_{dr}(\mathbb{T}^2)$,

$$\lim_{t\to\pm\infty} \omega\big(W_\theta(f)\,\Theta_A^t[W_\theta(g)]\big) = \lim_{t\to\pm\infty} \langle f^* |\,U_A^t\,| g\rangle = \langle f^*|\Omega\rangle\langle\Omega|g\rangle = \omega\big(W_\theta(f)\big)\,\omega\big(W_\theta(g)\big)\ .$$
This means that, independently of the deformation parameter $\theta$, the quantum dynamical
systems $(\mathcal{M}_\theta, \Theta_A, \omega)$ are mixing when such is the classical dynamical system
of which they represent a quantization.

As regards strong mixing (7.77), observe that (7.50) implies

$$\omega\Big(\big[W_\theta(n), \Theta_A^t[W_\theta(m)]\big]^\dagger\,\big[W_\theta(n), \Theta_A^t[W_\theta(m)]\big]\Big) = 4\,\sin^2\!\big(2\pi\theta\,\sigma(n, B^t m)\big)\ , \tag{7.80}$$

where we have set $B = A^T$, the transpose of $A$. Since $\sigma(n, B^t m)$ is an integer,
when $\theta \in \mathbb{Q}$ is rational, the right hand side of (7.80) is periodic in $t$ and cannot
vanish when $t \to \pm\infty$. Therefore, the quantum dynamical systems $(\mathcal{M}_{p/q}, \Theta_A, \omega)$
cannot be strong asymptotic Abelian, and thus not strongly mixing, because of
Proposition 7.4.6.
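The periodicity of the right hand side of (7.80) at rational $\theta$ can be illustrated numerically. The sketch below rests on assumptions that are not fixed by the text: the symplectic form is taken as $\sigma(n,m) = n_1 m_2 - n_2 m_1$, the hyperbolic matrix is the illustrative choice $A = \begin{pmatrix}2&1\\1&1\end{pmatrix}$, and the function names are hypothetical. Integer arithmetic is exact, and $\sigma$ is reduced modulo $q$ before the sine is evaluated.

```python
import math


def mat_vec(M, v):
    """Apply a 2x2 integer matrix to an integer vector."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])


def sigma(n, m):
    """Symplectic form on Z^2 (assumed convention: n1*m2 - n2*m1)."""
    return n[0] * m[1] - n[1] * m[0]


def commutator_expectations(A, n, m, p, q, steps):
    """Values of 4 sin^2(2 pi (p/q) sigma(n, B^t m)) for t = 0, ..., steps-1."""
    B = ((A[0][0], A[1][0]), (A[0][1], A[1][1]))  # transpose of A
    values, v = [], m
    for _ in range(steps):
        s = sigma(n, v) % q  # only sigma mod q matters when theta = p/q
        values.append(4.0 * math.sin(2.0 * math.pi * p * s / q) ** 2)
        v = mat_vec(B, v)
    return values


vals = commutator_expectations(((2, 1), (1, 1)), (1, 0), (0, 1), p=1, q=3, steps=12)
```

For this particular choice the sequence is periodic and stays away from zero, in line with the statement that strong asymptotic Abelianess fails at rational $\theta$.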

If $\theta$ is irrational, then, as proved in [60], the right hand side of (7.80) vanishes
asymptotically at most for a countable set of $\theta \in [0, 1]$. A concrete construction of
a countable set of $\theta$ is as follows [251].

Let $t > 0$; using (2.17)-(2.19) in Example 2.1.3 with $b$ and $c$ exchanged, one
explicitly computes


a(n, B'm) =C; (m) (niaz — nzaj+) + C-(m)a™ (nyaz— — naj) 
mıađ2— — M241- = 
=a’ A (niaz — mai+) + O(a) 
t+ 
=p a mnie —a)(a—a) 
+ mnb” — mın (a7! — a) — mm (a— a)) x 
The transposed matrix $B = A^T$ has the same eigenvalues $\alpha^{\pm 1}$ as $A$; therefore, $\mathrm{Tr}(B^t) = \alpha^t + \alpha^{-t} \in \mathbb{Z}$
for all $t \ge 0$. It thus follows that


att! 


az — 1 


a(n, B'm) = 


(mim c — mmn b — (mim + mnı)a 
+ mina! + myn, a) + O(a‘) 
1 2 
=z XO ratt + O(a) 
k=0 


2 


1 
3 ) r (att + a) + O(a‘), 
a2 — 

k=0 


where the coefficients $r_k \in \mathbb{Z}$ and thus also the sum is an integer.
Choose $\theta = \alpha^2 s \bmod (1)$, $s \in \mathbb{Z}$; then

$$\theta\,\sigma(n, B^t m) = s\,(\alpha^2 - 1)\,\sigma(n, B^t m) \bmod (1) = O(\alpha^{-t}) \bmod (1)\ .$$
Therefore, when $t \to +\infty$, the function in (7.80) vanishes and norm-asymptotic
Abelianess holds. Indeed, consider $W_\theta(f)$ and $W_\theta(g)$ where $f$ and $g$ have compact
supports, namely, there exists $K > 0$ such that $f(n) = g(n) = 0$ when $\|n\| > K$;
then,

$$\Big\|\big[W_\theta(f), \Theta_A^t[W_\theta(g)]\big]\Big\| \le \sum_{n,m} |f(n)|\,|g(m)|\ \Big\|\big[W_\theta(n), \Theta_A^t[W_\theta(m)]\big]\Big\| \le \alpha^{-t} \sum_{n,m} |f(n)|\,|g(m)|\,C_{n,m} \tag{7.81}$$

for suitable constants $C_{n,m}$. Because of the assumption on $\mathrm{Supp}(f)$ and $\mathrm{Supp}(g)$,
(7.81) goes to $0$ with $t \to +\infty$.

Thus, for $\theta = s\,\alpha^2 \bmod (1)$, $s \in \mathbb{Z}$, the systems $(\mathcal{M}_\theta, \Theta_A, \omega)$ are weakly mixing and
norm-asymptotic Abelian, whence, according to Proposition 7.4.6, strongly mixing.


7.4.4 Algebraic Quantum K-Systems 


The notion of K-systems is naturally extended by removing the Abelian constraint 
from Definition 2.3.6. 


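Definition 7.4.10 Let $(\mathcal{A}, \Theta, \omega)$ be a quantum dynamical system; if $\mathcal{A}$ is a $C^*$ algebra,
it is called a $C^*$ algebraic quantum K-system if there exists a $C^*$ subalgebra
$\mathcal{A}_0 \subset \mathcal{A}$ such that

1. $\mathcal{A}_t := \Theta_t[\mathcal{A}_0] \subset \mathcal{A}_{t+1}$ for all $t \in \mathbb{Z}$;
2. $\bigvee_{t\in\mathbb{Z}} \mathcal{A}_t = \mathcal{A}$;
3. $\bigwedge_{t\in\mathbb{Z}} \mathcal{A}_t = \{\lambda\mathbb{1}\}$,

where $\bigwedge_{t\in\mathbb{Z}} \mathcal{A}_t$ denotes the set theoretic intersection of the $C^*$ subalgebras $\mathcal{A}_t$ and
$\bigvee_{t\in\mathbb{Z}} \mathcal{A}_t$ the $C^*$ algebra they generate by norm closure. The nested sequence will be called
a quantum K-sequence.

If $\mathcal{A}$ is a von Neumann algebra acting on a Hilbert space, then $(\mathcal{A}, \Theta, \omega)$ is
a von Neumann algebraic K-system if there exists a quantum K-sequence of
von Neumann subalgebras $\mathcal{A}_t$, with $\bigvee_{t\in\mathbb{Z}} \mathcal{A}_t$ denoting the von Neumann algebra
obtained by strong-operator closure.


Remark 7.4.11 Typically [262], starting from a $C^*$ algebraic quantum K-system
with K-sequence $\mathcal{A}_t$, one considers the GNS representation $\pi_\omega(\mathcal{A})$ and the sequence
of von Neumann subalgebras $\pi_\omega(\mathcal{A}_t)''$ and checks whether it is a (von Neumann)
K-sequence for $(\pi_\omega(\mathcal{A})'', \Theta, \omega)$.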


We have seen in Sect. 2.3 that classical K-systems enjoy the strongest possible
clustering properties corresponding to K-mixing; to some extent this notion extends
to von Neumann quantum K-systems. Let a quantum dynamical system $(\mathcal{A}, \Theta, \omega)$
possess a sequence $\{\mathcal{A}_t\}_{t\in\mathbb{Z}}$ of $C^*$ subalgebras such that, setting $\mathcal{M} := \pi_\omega(\mathcal{A})''$ and
$\mathcal{M}_t := \pi_\omega(\mathcal{A}_t)''$, $\{\mathcal{M}_t\}_{t\in\mathbb{Z}}$ is a K-sequence for the von Neumann triplet $(\mathcal{M}, \Theta, \omega)$.
Let $\omega$ be a faithful state and $\mathcal{M}_0$ be invariant under the modular automorphism
$\sigma_\omega$, so that Proposition 7.4.1 ensures the existence of a normal conditional
expectation $E_0 : \mathcal{M} \to \mathcal{M}_0$ which respects the state. It thus follows that the CPU
maps $E_t := \Theta_t \circ E_0 \circ \Theta_{-t} : \mathcal{M} \to \mathcal{M}_t$ are $\omega$-preserving conditional expectations.
Setting

$$P_t\,\pi_\omega(A)|\Omega_\omega\rangle := E_t[\pi_\omega(A)]\,|\Omega_\omega\rangle \qquad \forall\, A \in \mathcal{A}\ ,$$

one obtains a bounded linear operator which can be extended to a bounded operator
$P_t : \mathcal{H}_\omega \to \mathcal{H}_t$, where $\mathcal{H}_t$ is the closure of the linear span of $\pi_\omega(\mathcal{A}_t)|\Omega_\omega\rangle$ and $\mathcal{H}_\omega$ is the
GNS Hilbert space corresponding to $\omega$.


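Proposition 7.4.8 The $P_t$ are projectors such that $P_t = U_\omega^\dagger(t)\,P_0\,U_\omega(t)$, where
$U_\omega(t)$ is the GNS unitary operator which implements $\Theta_t$ on $\mathcal{H}_\omega$. If $\{\mathcal{M}_t\}_{t\in\mathbb{Z}}$ is a
K-sequence, then [259]

1. $P_t \le P_{t+1}$ for all $t \in \mathbb{Z}$;
2. $\text{s-}\!\lim_{t\to+\infty} P_t = \mathbb{1}$;
3. $\text{s-}\!\lim_{t\to-\infty} P_t = |\Omega_\omega\rangle\langle\Omega_\omega|$.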


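Proof From the properties of the conditional expectations (see (5.46) in
Proposition 5.2.5), it follows that $P_t^2 = P_t$. That $P_t^\dagger = P_t$ follows from the assumption that
$\omega \circ E_t = \omega$; indeed, using (5.44) and (5.45), one gets

$$\langle\pi_\omega(A)\Omega_\omega|\,P_t\,\pi_\omega(B)\Omega_\omega\rangle = \omega\big(A^\dagger E_t[B]\big) = \omega\big(E_t[A^\dagger]\,E_t[B]\big) = \omega\big(E_t[A^\dagger]\,B\big) = \langle P_t\,\pi_\omega(A)\Omega_\omega|\,\pi_\omega(B)\Omega_\omega\rangle\ ,$$
for all $A, B \in \mathcal{A}$ (thus on a dense subset of $\mathcal{H}_\omega$). Also, on a dense subset, it holds that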
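$$P_t\,\pi_\omega(A)|\Omega_\omega\rangle = \Theta_t \circ E_0 \circ \Theta_{-t}[\pi_\omega(A)]\,|\Omega_\omega\rangle$$
$$= U_\omega^\dagger(t)\,E_0\big[U_\omega(t)\,\pi_\omega(A)\,U_\omega^\dagger(t)\big]\,|\Omega_\omega\rangle$$
$$= U_\omega^\dagger(t)\,P_0\,U_\omega(t)\,\pi_\omega(A)|\Omega_\omega\rangle\ .$$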


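This proves the second statement of the Proposition, while, of the last assertions, the
first one is a consequence of $\mathcal{M}_t \subseteq \mathcal{M}_{t+1}$ and the last two relations follow from

$$\lim_{t\to+\infty} \langle\pi_\omega(A)\Omega_\omega|\,(P_t - \mathbb{1})\,|\pi_\omega(B)\Omega_\omega\rangle = \lim_{t\to+\infty} \omega\big(A^\dagger\,(E_t[B] - B)\big) = 0$$
$$\lim_{t\to-\infty} \langle\pi_\omega(A)\Omega_\omega|\,\big(P_t - |\Omega_\omega\rangle\langle\Omega_\omega|\big)\,|\pi_\omega(B)\Omega_\omega\rangle = \lim_{t\to-\infty} \omega\big(A^\dagger\,E_t[B]\big) - \omega(A^\dagger)\,\omega(B) = 0\ .$$

In fact, these two limits imply weak-operator convergence of projections to
projections, which is equivalent to strong-operator convergence. Notice that the second
limit holds since $E_t$ maps onto the trivial algebra when $t \to -\infty$ and $\omega(E_t[B]) = \omega(B)$.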


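Corollary 7.4.2 Let $(\mathcal{M}, \Theta, \omega)$ be a von Neumann algebraic quantum K-system as
specified above; then, for any $\varepsilon > 0$, $A_0 \in \mathcal{M}_0$ and $A \in \mathcal{M}$ there exists $T > 0$ such
that

$$\big|\omega\big(A_0\,\Theta_t[A]\big) - \omega(A_0)\,\omega(A)\big| \le \varepsilon\,\sqrt{\omega\big(A_0 A_0^\dagger\big)}$$

for all $t \le -T$.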


Proof The result is a consequence of the second strong-operator limit in the previous 
proposition and of [259,353] 


w(Ao@,[A]) = w(Ao Po US (t)A) = w(Ao US (t) P+ A), 
whence 


w(Ao@rLA]) = w(Ao)w(A)| = |w( A0 USO (Pr = 12) 21) A)| 


< y w(Ao Ag) ICP; = | Qu X Ro |) Al Bu VIL 


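Corollary 7.4.3 Let $(\mathcal{M}, \Theta, \omega)$ be a von Neumann algebraic quantum K-system as
specified above; then, it is weakly mixing.


Proof Because of Remark 7.4.9, one need only show

$$\lim_{t\to\pm\infty} \omega\big(A\,\Theta_t[B]\big) = \omega(A)\,\omega(B) \qquad \forall\, A, B \in \mathcal{M}\ .$$

Since $\{\mathcal{M}_t\}_{t\in\mathbb{Z}}$ is a K-sequence, let $\varepsilon > 0$ and choose $A_\varepsilon \in \mathcal{M}_0$ and $s \in \mathbb{Z}$ large
enough such that

$$\big\|\big(A - \Theta_s[A_\varepsilon]\big)|\Omega_\omega\rangle\big\| \le \varepsilon\ .$$
Then, for $|t|$ sufficiently large,

$$\big|\omega\big(A\,\Theta_t[B]\big) - \omega(A)\,\omega(B)\big| \le \big|\omega\big((A - \Theta_s[A_\varepsilon])\,\Theta_t[B]\big)\big| + \big|\omega\big(A_\varepsilon\,\Theta_{t-s}[B]\big) - \omega(A)\,\omega(B)\big| \le 2\,\varepsilon\,\|B\|\ .$$

This shows clustering when $t \to -\infty$; when $t \to +\infty$, one uses the modular
automorphism to rewrite

$$\omega\big(A\,\Theta_t[B]\big) = \omega\big(\Theta_{-t}[A]\,B\big) = \omega\big(\sigma_\omega(i)[B]\,\Theta_{-t}[A]\big)\ ,$$

and then applies the previous argument.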
Remarks 7.4.12

1. The result in Corollary 7.4.2 is the maximum of uniformity one can achieve in
clustering; indeed, if

$$\big|\omega\big(B\,\Theta_t[A]\big) - \omega(B)\,\omega(A)\big| \le \varepsilon\,\sqrt{\omega\big(B B^\dagger\big)}$$
for all $t \le -T$ and $B \in \mathcal{M}$, then $B = \Theta_t[A^\dagger]$ would yield
$$0 = \omega(A^\dagger A) - |\omega(A)|^2 = \omega\big((A^\dagger - \omega(A)^*)\,(A - \omega(A))\big)\ .$$

As $\omega$ was assumed faithful, this gives $A = \omega(A)\,\mathbb{1}$ for all $A$.

2. By substituting $A_0$ with $\Theta_s[A_0]$ for fixed $s$, the uniformity in Corollary 7.4.2
holds with respect to any fixed $\mathcal{M}_s$.

3. The structure of the nested sequence of Hilbert subspaces $\{\mathcal{H}_t\}_{t\in\mathbb{Z}}$ corresponding
to the projections $P_t$ very much resembles that arising from the Lebesgue
spectrum of classical K-systems (see the discussion after Remark 2.3.5). However,
the projections $\{P_t\}_{t\in\mathbb{Z}}$ have been constructed by relying on the existence
of $\omega$-preserving conditional expectations $E_t : \mathcal{M} \to \mathcal{M}_t$. For a state like the
tracial state, which has trivial modular automorphism, they surely exist; however,
this need not be true in general. Luckily, the previous results can also be proved
without referring to the existence of a K-sequence of projections [262].


Example 7.4.13 As sketched in Example 2.3.3.4, the classical hyperbolic
automorphisms of the torus are K-systems. Aided by this fact we shall show that, for rational
values of the deformation parameter $\theta = p/q$, their quantized versions $(\mathcal{M}_\theta, \Theta_A, \omega)$
are algebraic von Neumann quantum K-systems [60].

According to Example 7.4.6, we shall identify the classical automorphisms of the
torus as triplets $(\mathcal{M}_0, \Theta_A, \omega)$, where $\mathcal{M}_0$ is the von Neumann Abelian algebra of
essentially bounded functions on $\mathbb{T}^2$, $\Theta_A$ is such that $\Theta_A[f](r) = f(Ar)$, $f \in \mathcal{M}_0$,
and $\omega$ is the integration on $\mathbb{T}^2$ with respect to the uniform measure $dr$. Therefore,
the K-partition characterizing them as K-systems amounts to the existence of a 
K-sequence {M;}rez of von Neumann subalgebras of M (see Definition 2.3.6). 

Because of the decomposition (7.49), let M, C M p/q be defined, with obvious 
use of the notation, as 


Mi:= $ SaN] Wp/q(s) - (7.82) 
seJ(q) 


In this way, the K -properties of the classical K -sequence {\;},-z would make the 
characterizing properties in Definition 7.4.10 also hold for the non-commutative 
sequence {M;}1ez of subalgebras of M p/q. The first two conditions are in fact 
immediate, while the third one comes from the fact that (7.46) implies T [1] = 05.9. 
Unfortunately, one has first to ensure that the M, are subalgebras, namely that, 
if f, g € Ni, also 


XO NLS] Sg Mel gl] W/q(s) Wp © = 
s,teJ(q) 


= X SAMs f1 Pal] Wp (Gls +t) Wojq(< 8 +t >) 
s,teJ(q) 


D d[P Tg] Wolls +| Wpjq(<s+e>) 0.83) 
s,teJ(q) 


belongs to Mz. 

To this purpose, consider the von Neumann subalgebra Mo C Mo consisting of 
those essentially bounded functions on T° that satisfy (7.45). Since Mo is mapped 
into itself by Oq, the K-properties of the sequence {M; }rez extend to the sequence 
(Nw? = NEN MO Jez, whence the triplets (Me, Oqa, w) are also K-systems.° 
Further, let B denote the (von Neumann) subalgebra generated by the characteristic 
functions x 4(s) of the partition of the torus into atoms 


gs t1 


Ag):={r: <x < ae 
q 
where sin € J (q); set B; := Oa (B), By := VA 


s=—o Bs. Finally, construct the von 
Neumann subalgebras 


Ñ, = Ni? M By ’ 

and consider the sequence {N;};ez.- = 7 

This is a K-sequence for Mo; indeed, N, C N41 directly follows from the 
analogous property of the K-sequence {NV },<z, while Vez Ni = M is a con- 
sequence of the fact that V,ez NO = Mo together with the observation that 
Mo = Mo VBC View N. Finally, Aez Mt = {X11} follows from the fact that, 
according to Proposition 2.3.5, Tail (B) = {All}. 

Let us now insert the subalgebras M; in the place of M; in (7.82); using (7.47), 
for f € N;,s € J(q), let us consider 


1 ; 
IIs f] Wo(s) = = > PIFI en 2 tist/q 
teJ(q) 


Now, it turns out that the map in (7.45) fulfils 


As As 
>? [Ox LIM = Calf (r+ =) = s (ars =) =f (ar+ < =) 


q 
= 0a [Bo]. 


3 Here, © and w denote the restrictions of the dynamics and the state to Mo. 
Since yf? maps the subalgebra B into itself for all s € J (q), it turns out that, when 
fe N,,, all the components in (7.82) also belong to N,. Therefore, with f.g€ N,, 
it follows that 


MN, > TLF] Wo(s) Telg] Wo) = (TsLF1 Tlg] Wo(gls +4) Wo(< s +t >). 
baM 
EM cst (A) 


This shows that the linear sets in (7.83) are subalgebras of $\mathcal{M}_{p/q}$ and completes
the proof that the quantum dynamical triplets $(\mathcal{M}_{p/q}, \Theta_A, \omega)$ are algebraic von
Neumann quantum K-systems.


Remark 7.4.13 The reason why all quantized hyperbolic automorphisms of the
torus are algebraic K-systems for rational deformation parameters is that the
properties of their classical counterparts are inherited by the commutative subsystems (the
centers) contained in $(\mathcal{M}_{p/q}, \Theta_A, \omega)$. When the deformation parameter is irrational,
this is no longer true and indeed one can prove that, apart from countable sets of
$\theta \in [0, 1]$, the systems $(\mathcal{M}_\theta, \Theta_A, \omega)$ cannot be algebraic K-systems [60].


7.4.5 Quantum Spin-Chains 


As sketched in Example 7.4.1, the algebraic structure of quantum spin-chains is
as in Definition 2.2.5, the difference being that at each site of the one-dimensional
lattice indexed by the integers $k \in \mathbb{Z}$, instead of diagonal matrix algebras, there
remain assigned copies $\mathcal{A}_k$ of a same non-commutative algebra $\mathcal{A}$. In the following,
we shall consider $\mathcal{A} = M_d(\mathbb{C})$, namely chains consisting of linear arrays of $d$-level
quantum systems (or spins).

The algebra $\mathcal{A}_{\mathbb{Z}}$ associated with the infinite chain is the quasi-local $C^*$-algebra
which arises by taking the norm closure of the $*$-algebra consisting of operators from
all local algebras $\mathcal{A}_{[-\ell,\ell]} := \bigotimes_{k=-\ell}^{\ell} \mathcal{A}_k = M_d(\mathbb{C})^{\otimes(2\ell+1)}$, supported by the lattice
sites in the interval $[-\ell, \ell]$. If $\mathcal{A}_{loc}$ denotes the strictly local $*$-algebra $\bigcup_{\ell\in\mathbb{N}} \mathcal{A}_{[-\ell,\ell]}$,
then $\mathcal{A}_{\mathbb{Z}} = \overline{\mathcal{A}_{loc}}$.

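The local algebras $\mathcal{A}_{[-\ell,\ell]} = \bigotimes_{k=-\ell}^{\ell} \mathcal{A}_k$ describe spin arrays located at finitely
many lattice sites $-\ell \le k \le \ell$. Their elements are linear combinations of tensor
products $\bigotimes_{k=-\ell}^{\ell} A_k$, $A_k \in \mathcal{A}_k$. If $0 \le \ell \le p$, the local algebra $\mathcal{A}_{[-\ell,\ell]}$ can be embedded
into $\mathcal{A}_{[-p,p]}$ as follows; we shall denote by

$$\mathbb{1}_{[i,j]} := \bigotimes_{k=i}^{j} \mathbb{1}_k\ ,\qquad \mathbb{1}_{i-1]} := \bigotimes_{k=-\infty}^{i-1} \mathbb{1}_k\ ,\qquad \mathbb{1}_{[j+1} := \bigotimes_{k=j+1}^{+\infty} \mathbb{1}_k \tag{7.84}$$

the tensor products of identities at sites from $i$ to $j$, from $-\infty$ to $i-1$ and
from $j+1$ to $+\infty$, respectively. Then, $\mathcal{A}_{[-\ell,\ell]}$ is embedded into $\mathcal{A}_{[-p,p]}$ as
$\mathbb{1}_{[-p,-\ell-1]} \otimes \mathcal{A}_{[-\ell,\ell]} \otimes \mathbb{1}_{[\ell+1,p]}$. Analogously, $\mathcal{A}_{[-\ell,\ell]}$ is embedded into $\mathcal{A}_{\mathbb{Z}}$ as
$\mathbb{1}_{-\ell-1]} \otimes \mathcal{A}_{[-\ell,\ell]} \otimes \mathbb{1}_{[\ell+1}$. In the following, for the sake of simplicity, we shall often
identify local algebras $\mathcal{A}_{[-\ell,\ell]}$ with their embeddings, as well as their elements as
elements of $\mathcal{A}_{\mathbb{Z}}$.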
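The dynamics over $\mathcal{A}_{\mathbb{Z}}$ is the shift automorphism $\Theta_\sigma : \mathcal{A}_{\mathbb{Z}} \to \mathcal{A}_{\mathbb{Z}}$,

$$\Theta_\sigma\big(\mathcal{A}_{[-\ell,\ell]}\big) = \mathcal{A}_{[-\ell+1,\ell+1]}$$
$$\Theta_\sigma\Big(\mathbb{1}_{-\ell-1]} \otimes \Big(\bigotimes_{k=-\ell}^{\ell} A_k\Big) \otimes \mathbb{1}_{[\ell+1}\Big) = \mathbb{1}_{-\ell]} \otimes \Big(\bigotimes_{k=-\ell+1}^{\ell+1} A_{k-1}\Big) \otimes \mathbb{1}_{[\ell+2}\ .$$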


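In order to complete the description of quantum spin-chains as quantum dynamical
systems we need to provide $\mathcal{A}_{\mathbb{Z}}$ with translation invariant states, that is with positive, normalized
functionals $\omega : \mathcal{A}_{\mathbb{Z}} \to \mathbb{C}$ such that $\omega \circ \Theta_\sigma = \omega$. Given any such state, its restrictions
$\omega_{[i,j]} := \omega{\restriction}\mathcal{A}_{[i,j]}$ to local subalgebras $\mathcal{A}_{[i,j]} := \bigotimes_{k=i}^{j} \mathcal{A}_k$ are density matrices
$\rho_{[i,j]} \in M_d(\mathbb{C})^{\otimes(j-i+1)}$. Since it originates from the global state $\omega$, the family of
$\rho_{[i,j]}$ is automatically compatible with the embedding $\mathcal{A}_{[i,j]} \subset \mathcal{A}_{[i,j+1]}$, that is
they fulfil the condition

$$\omega\big(A_i \otimes \cdots \otimes A_j \otimes \mathbb{1}_{j+1}\big) = \mathrm{Tr}_{[i,j+1]}\big(\rho_{[i,j+1]}\ A_i \otimes \cdots \otimes A_j \otimes \mathbb{1}_{j+1}\big)$$
$$= \mathrm{Tr}_{[i,j]}\big((\mathrm{Tr}_{j+1}\,\rho_{[i,j+1]})\ A_i \otimes \cdots \otimes A_j\big)$$
$$= \mathrm{Tr}_{[i,j]}\big(\rho_{[i,j]}\ A_i \otimes \cdots \otimes A_j\big)\ ,$$
where $\mathrm{Tr}_{[i,j]}$ indicates that the trace has to be performed with respect to an
orthonormal basis of the Hilbert space $(\mathbb{C}^d)^{\otimes(j-i+1)}$ associated with the spins at sites
$i \le k \le j$. In other words,

$$\mathrm{Tr}_{j+1}\,\rho_{[i,j+1]} = \rho_{[i,j]} \qquad \forall\, i \le j \in \mathbb{Z}\ . \tag{7.85}$$
Further, translation invariance implies

$$\omega\big(\Theta_\sigma(A_i \otimes \cdots \otimes A_j)\big) = \omega\big(\mathbb{1}_i \otimes A_i \otimes \cdots \otimes A_j\big)$$
$$= \mathrm{Tr}_{[i,j+1]}\big(\rho_{[i,j+1]}\ \mathbb{1}_i \otimes A_i \otimes \cdots \otimes A_j\big)$$
$$= \mathrm{Tr}_{[i+1,j+1]}\big((\mathrm{Tr}_{i}\,\rho_{[i,j+1]})\ A_i \otimes \cdots \otimes A_j\big)$$
$$= \omega\big(A_i \otimes \cdots \otimes A_j\big) = \mathrm{Tr}_{[i,j]}\big(\rho_{[i,j]}\ A_i \otimes \cdots \otimes A_j\big)\ ,$$

whence the local states satisfy
$$\mathrm{Tr}_{i}\,\rho_{[i,j+1]} = \rho_{[i+1,j+1]} = \rho_{[i,j]} \qquad \forall\, i \le j \in \mathbb{Z}\ . \tag{7.86}$$
Vice versa, if $\mathcal{A}_{\mathbb{Z}}$ is equipped with a family of local states $\rho_{[i,j]}$, $i \le j \in \mathbb{Z}$, satisfying
(7.85) and (7.86), then they define a translation invariant state $\omega$ on $\mathcal{A}_{\mathbb{Z}}$ such that its
restrictions to local subalgebras satisfy $\omega{\restriction}\mathcal{A}_{[i,j]} = \rho_{[i,j]}$ [11].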
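For two neighbouring sites, conditions (7.85) and (7.86) together say that tracing out either site of the two-site density matrix returns the same one-site density matrix. The following minimal sketch checks this on an illustrative two-qubit state, the equal mixture of "up-down" and "down-up"; this state and the helper names are assumptions chosen for the illustration, not objects defined in the text.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def partial_trace_last(R, d):
    """Trace out the second of two d-level sites."""
    return [[sum(R[i * d + k][j * d + k] for k in range(d))
             for j in range(d)] for i in range(d)]


def partial_trace_first(R, d):
    """Trace out the first of two d-level sites."""
    return [[sum(R[i * d + k][i * d + l] for i in range(d))
             for l in range(d)] for k in range(d)]


up = [[1.0, 0.0], [0.0, 0.0]]
down = [[0.0, 0.0], [0.0, 1.0]]

# Illustrative two-site state: (up x down + down x up) / 2
rho_12 = [[0.5 * (x + y) for x, y in zip(r1, r2)]
          for r1, r2 in zip(kron(up, down), kron(down, up))]

left_marginal = partial_trace_last(rho_12, 2)    # analogue of Tr_{j+1} rho_[i,j+1]
right_marginal = partial_trace_first(rho_12, 2)  # analogue of Tr_i rho_[i,j+1]
```

Both marginals coincide with the maximally mixed one-site state, as required of the restrictions of a translation invariant state.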


Definition 7.4.11 (Quantum Spin-Chains) Quantum spin-chains are dynamical
systems represented by algebraic triplets $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ where
1. $\mathcal{A}_{\mathbb{Z}}$ is a quasi-local algebra with a $d$-level system at each site;
2. $\Theta_\sigma : \mathcal{A}_{\mathbb{Z}} \to \mathcal{A}_{\mathbb{Z}}$ is the shift-automorphism over $\mathcal{A}_{\mathbb{Z}}$;
3. $\omega : \mathcal{A}_{\mathbb{Z}} \to \mathbb{C}$ is a translation invariant state over $\mathcal{A}_{\mathbb{Z}}$.


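Example 7.4.14 Quantum spin-chains turn out to be $C^*$ algebraic quantum K-systems;
indeed, one argues in the same way as for classical spin-chains (see
Remark 2.3.5). The K-sequence consists of the quasi-local algebras $\mathcal{A}_t := \Theta_\sigma^t(\mathcal{A}_0)$
where $\mathcal{A}_0 \subset \mathcal{A}_{\mathbb{Z}}$ is the quasi-local algebra which arises as the $C^*$ inductive limit of
the local matrix algebras $\mathcal{A}_{[p,q]}$, with $p \le q \le 0$.

If $\omega$ is a factor state, $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ is also a von Neumann algebraic quantum K-system.
This can be seen as follows: denote by $\mathcal{A}_{[t}$ the quasi-local algebra generated
by all matrix algebras of the form $\mathcal{A}_{[p,q]}$ with $t \le p \le q$ and set $\mathcal{M}_t := \pi_\omega(\mathcal{A}_t)''$,
$\mathcal{M}_{[t} := \pi_\omega(\mathcal{A}_{[t})''$, $\mathcal{M}'_t := (\mathcal{M}_t)'$ for the various commutants. Clearly, the first two
conditions in the second part of Definition 7.4.10 are satisfied. The third one is
obtained as follows: since $\mathcal{M}_{[t+1} \subseteq \mathcal{M}'_t$, one finds [262]

$$\Big(\bigwedge_{t\in\mathbb{Z}} \mathcal{M}_t\Big)' = \bigvee_{t\in\mathbb{Z}} \mathcal{M}'_t = \bigvee_{t\in\mathbb{Z}} \big(\mathcal{M}'_t \vee \mathcal{M}_{[t+1}\big) = \Big(\bigvee_{t\in\mathbb{Z}} \mathcal{M}'_t\Big) \vee \Big(\bigvee_{t\in\mathbb{Z}} \mathcal{M}_{[t+1}\Big) = \mathcal{M}' \vee \mathcal{M} = \big(\mathcal{M} \cap \mathcal{M}'\big)' = \{\lambda\mathbb{1}\}'\ ,$$

for $\omega$ has been assumed to be a factor state, whence the center $\mathcal{Z}_\omega = \mathcal{M} \cap \mathcal{M}'$ is trivial.

This is not true in general; for instance, in the case of Example 7.4.11, the state
(7.75) is not a factor state and the von Neumann system is not clustering. Therefore,
according to Corollary 7.4.3, $(\mathcal{M}, \Theta_\sigma, \omega)$ cannot be a von Neumann algebraic
quantum K-system.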


7.4.5.1 Ergodic Quantum Spin-Chains 
In Remark 7.4.8, we have seen that asymptotic Abelianess allows one to uniquely
decompose non-extremal invariant states into their ergodic components.

As a concrete application, consider a quantum spin-chain $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ whose state
$\omega$ is extremal invariant (see Definition 7.4.8) with respect to the lattice translation
by one site, $\Theta_\sigma$, but not with respect to lattice translations by $\ell$ sites, $\Theta_\sigma^\ell$, for some
$\ell \in \mathbb{N}$. We prove the following result [74].


Proposition 7.4.9 Let $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ be an ergodic quantum spin-chain. For any $\ell \in \mathbb{N}$
the state $\omega$ can be written as a convex decomposition

$$\omega = \frac{1}{n_\ell} \sum_{j=0}^{n_\ell - 1} \omega_j\ , \tag{7.87}$$

where $n_\ell$ divides $\ell$ and the $\omega_j$ are shift-invariant states over the spin-chain which
are ergodic with respect to $\Theta_\sigma^\ell$.
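A toy classical analogue may help to visualise why $n_\ell$ divides $\ell$. Assume, purely for illustration, that $\omega$ is the uniform average over the translates of a configuration of minimal period `period`, like the anti-ferromagnetic state (7.75) with period 2. The components that are invariant and ergodic under translations by $\ell$ sites then correspond to the orbits of the translates under addition of $\ell$ modulo the period. The function below, whose name is hypothetical, enumerates these orbits.

```python
from math import gcd


def ergodic_components(period, ell):
    """Orbits of {0, ..., period-1} under j -> (j + ell) mod period.

    Each orbit labels one component of the averaged periodic state that is
    invariant under translations by ell sites (toy model, see the lead-in).
    """
    seen, orbits = set(), []
    for start in range(period):
        if start in seen:
            continue
        orbit, j = [], start
        while j not in seen:
            seen.add(j)
            orbit.append(j)
            j = (j + ell) % period
        orbits.append(orbit)
    return orbits
```

In this toy model the number of components equals `gcd(period, ell)`, which indeed divides `ell`; for the period-2 state and `ell = 2` one recovers the two components $\omega_\pm$.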
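Proof Consider the commutant $(\mathcal{R}_\omega^\ell)' = \pi_\omega(\mathcal{A}_{\mathbb{Z}})' \cap \{U_\omega^{\ell n}\}'_{n\in\mathbb{Z}}$ of the covariance
algebra (see Definition 7.4.5) $\mathcal{R}_\omega^\ell = \big(\pi_\omega(\mathcal{A}_{\mathbb{Z}}) \cup \{U_\omega^{\ell n}\}_{n\in\mathbb{Z}}\big)''$, which is built by means of the
$C^*$ algebra $\pi_\omega(\mathcal{A}_{\mathbb{Z}})$ and the group of unitary GNS operators $U_\omega^{\ell n}$, $n \in \mathbb{Z}$, instead of
$U_\omega(\mathbb{Z})$.

Since $\mathcal{A}_{\mathbb{Z}}$ is norm-asymptotic Abelian and $\omega$ is assumed not to be $\Theta_\sigma^\ell$-ergodic,
because of Corollary 7.4.1, it turns out that $(\mathcal{R}_\omega^\ell)' \ne \{\lambda\mathbb{1}\}$. Let $\{Q_i\}_{i\in I}$ be a
decomposition of the identity by orthogonal projections in $(\mathcal{R}_\omega^\ell)'$; then, the cardinality of $I$,
$n_\ell$, must fulfil $n_\ell \le \ell$.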

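Indeed, let $P$ denote any of the $Q_i$, $i \in I$, and set $P_j := U_\omega^j\,P\,(U_\omega^j)^\dagger$, $0 \le j \le \ell - 1$.
Since $P$ commutes with $\pi_\omega(\mathcal{A}_{\mathbb{Z}})$, one derives

$$\pi_\omega(X)\,P_j = U_\omega^j\,\pi_\omega\big(\Theta_\sigma^j[X]\big)\,P\,(U_\omega^j)^\dagger = U_\omega^j\,P\,\pi_\omega\big(\Theta_\sigma^j[X]\big)\,(U_\omega^j)^\dagger = P_j\,\pi_\omega(X)\ ,$$

for all $X \in \mathcal{A}_{\mathbb{Z}}$. Moreover, $P_j$ commutes with $U_\omega^\ell$, whence $P_j \in (\mathcal{R}_\omega^\ell)'$ for all
$0 \le j \le \ell - 1$ and so does $\hat{P} := \bigvee_{j=0}^{\ell-1} P_j$, namely the smallest one among the
projections $Q$ such that $Q \ge P_j$ for all $0 \le j \le \ell - 1$.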

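By decomposing $n \in \mathbb{Z}$ as $n = m\ell + r$, $0 \le r \le \ell - 1$, it follows that

$$U_\omega^n\,\hat{P}\,(U_\omega^n)^\dagger = \hat{P}\ .$$
Therefore, $\hat{P}$ is $\Theta_\sigma$-invariant, whence $\hat{P} = \mathbb{1}$ because of the $\Theta_\sigma$-ergodicity
of $\omega$. Then, (7.61) in the ergodic list yields

$$1 = \langle\Omega|\,\hat{P}\,|\Omega\rangle \le \sum_{j=0}^{\ell-1} \langle\Omega|\,P_j\,|\Omega\rangle = \ell\,\langle\Omega|\,P\,|\Omega\rangle\ .$$

Applying this argument to each of the $Q_i$, one obtains

$$1 = \langle\Omega|\,\sum_{i\in I} Q_i\,|\Omega\rangle \ge \frac{n_\ell}{\ell}\ .$$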


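Since $\mathcal{A}_{\mathbb{Z}}$ is norm-asymptotic Abelian, from the second step in the proof of
Proposition 7.4.5, one deduces that $(\mathcal{R}_\omega^\ell)'$ is Abelian. Let then $Q_j$, $0 \le j \le n_\ell - 1 < \ell$, be
its minimal projections (see Example 5.3.4.1) with $Q_0$ such that

$$q_0 := \langle\Omega|\,Q_0\,|\Omega\rangle \le q_j := \langle\Omega|\,Q_j\,|\Omega\rangle$$
for all $j > 0$. Then, introduce the set
$$S_0 := \big\{ j \in \mathbb{Z} \,:\, U_\omega^j\,Q_0\,(U_\omega^j)^\dagger = Q_0 \big\}\ .$$
It follows that $S_0 \supseteq \ell\,\mathbb{Z}$; further, set $k_0 := \min\{0 < j \in S_0\}$. Then, $\ell$ is a multiple of $k_0$;
otherwise, $\ell = p\,k_0 + q$, for some $p \ge 0$ and $0 < q < k_0$, so that $q \in S_0$, thus
contradicting the minimality of $k_0$.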
Furthermore, set $Q_{0,j} := U_\omega^j\,Q_0\,(U_\omega^j)^\dagger$, $0 \le j < k_0$; $Q_{0,j}$ belongs to $(\mathcal{R}_\omega^\ell)'$. If
it is not a minimal projector $Q_{i(j)}$, then $Q_{0,j} = \sum_i Q_i$, thus contradicting $q_0 \le q_j$
when $j > 0$. Thus, $Q_{0,j} = Q_{i(j)}$. If $\hat{Q}_0 := \bigvee_{j=0}^{k_0-1} Q_{0,j} = \sum_{j=0}^{k_0-1} Q_{0,j}$ (the projectors
are now orthogonal), it follows that, as shown before, $\hat{Q}_0 \in (\mathcal{R}_\omega)'$ and hence $\hat{Q}_0 = \mathbb{1}$.
Consequently, because of the uniqueness of the orthogonal decomposition of the
identity in an Abelian algebra, it turns out that $k_0 = n_\ell$, whence $Q_j = U_\omega^j\,Q_0\,(U_\omega^j)^\dagger$
and $q_0 = q_j = n_\ell^{-1}$.
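One can now introduce the states on $\mathcal{A}_{\mathbb{Z}}$ defined by

$$\mathcal{A}_{\mathbb{Z}} \ni X \mapsto \omega_j(X) := n_\ell\,\langle\Omega|\,Q_j\,\pi_\omega(X)\,|\Omega\rangle = n_\ell\,\langle\Omega|\,Q_0\,\pi_\omega\big(\Theta_\sigma^j[X]\big)\,|\Omega\rangle = \omega_0\big(\Theta_\sigma^j[X]\big)\ ,$$

for all $X \in \mathcal{A}_{\mathbb{Z}}$. It turns out that the $\omega_j$'s are all $\Theta_\sigma^\ell$-ergodic, otherwise it would
be possible to further convexly decompose them (hence $\omega$) into $\Theta_\sigma^\ell$-invariant
components:

$$\omega_j = \sum_i \lambda_{ji}\,\omega_{ji}\ ,\qquad \omega = \sum_{j=0}^{n_\ell-1} \sum_i \frac{\lambda_{ji}}{n_\ell}\,\omega_{ji}\ .$$

As explained in Remark 7.4.5.2, the decomposers $\omega_{ji}$ correspond to projectors
$P_{ji} \in (\mathcal{R}_\omega^\ell)'$ such that $\dfrac{\lambda_{ji}}{n_\ell} = \langle\Omega|\,P_{ji}\,|\Omega\rangle$ and

$$\lambda_{ji}\,\omega_{ji}(X) = n_\ell\,\langle\Omega|\,P_{ji}\,\pi_\omega(X)\,|\Omega\rangle \le \omega_j(X) = n_\ell\,\langle\Omega|\,P_j\,\pi_\omega(X)\,|\Omega\rangle\ ,\qquad \forall\, X \in \mathcal{A}_{\mathbb{Z}}\ .$$

As the projections $P_{ji}$ and $P_j$ belong to the commutant $\pi_\omega(\mathcal{A}_{\mathbb{Z}})'$, by choosing $X = Y^\dagger Z$
one gets

$$\langle\Omega|\,\pi_\omega(Y)^\dagger\,P_{ji}\,\pi_\omega(Z)\,|\Omega\rangle \le \langle\Omega|\,\pi_\omega(Y)^\dagger\,P_j\,\pi_\omega(Z)\,|\Omega\rangle\ .$$
Since $Y$ and $Z$ are arbitrary elements of $\mathcal{A}_{\mathbb{Z}}$ and the vectors $\pi_\omega(X)|\Omega\rangle$ are dense
in the GNS Hilbert space, it turns out that $P_{ji} \le P_j$. But the $P_j$'s are minimal
projections, thus $P_{ji} = P_j$.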


7.4.5.2 Finitely Correlated States 
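An interesting class of translation-invariant states on a quantum spin-chain $\mathcal{A}_{\mathbb{Z}}$ is
constructed as follows [135]. Let $(\mathcal{B}, \rho, \mathbb{E})$ be an auxiliary triplet, where

1. $\mathcal{B}$ is a finite dimensional algebra that we shall fix to be the algebra $M_b(\mathbb{C})$ of $b \times b$
matrices acting on $\mathbb{C}^b$;
2. $\rho$ is a state on $\mathcal{B}$ identified by a density matrix: $\rho(B) = \mathrm{Tr}(\rho\,B)$;
3. $\mathbb{E} : \mathcal{A} \otimes \mathcal{B} \to \mathcal{B}$ is a completely positive map such that
$$\mathbb{E}[\mathbb{1}_{\mathcal{A}} \otimes \mathbb{1}_{\mathcal{B}}] = \mathbb{1}_{\mathcal{B}} \tag{7.88}$$
$$\rho \circ \mathbb{E}[\mathbb{1}_{\mathcal{A}} \otimes B] = \rho(B)\ , \tag{7.89}$$

where $\mathbb{1}_{\mathcal{A}}$ and $\mathbb{1}_{\mathcal{B}}$ denote the identities of the algebras $\mathcal{A}$, respectively $\mathcal{B}$.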


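Since they result from iteratively composing CPU maps, the following maps

$$\mathbb{E}^{(n)} := \mathbb{E} \circ \big(\mathrm{id}_{\mathcal{A}} \otimes \mathbb{E}^{(n-1)}\big) : \mathcal{A}^{\otimes(n+1)} \otimes \mathcal{B} \to \mathcal{B}\ ,\quad n \ge 1\ ,\qquad \mathbb{E}^{(0)} := \mathbb{E}\ , \tag{7.90}$$

are also CPU. Consequently, the functionals $\rho \circ \mathbb{E}^{(n)}$ are positive and
normalized. They are thus states; moreover, the corresponding
density matrices can be obtained by duality: in fact,

$$\mathrm{Tr}_{\mathcal{B}}\big(\rho\ \mathbb{E}[A \otimes B]\big) = \mathrm{Tr}_{\mathcal{A}\otimes\mathcal{B}}\big(\mathbb{F}[\rho]\ A \otimes B\big)\ , \tag{7.91}$$

where $\mathbb{F} : \mathcal{S}(\mathcal{B}) \to \mathcal{S}(\mathcal{A} \otimes \mathcal{B})$ is the trace-preserving dual map of $\mathbb{E}$, which
transforms states over $\mathcal{B}$ into states over $\mathcal{A} \otimes \mathcal{B}$. Analogously, to the CPU maps $\mathbb{E}^{(n)}$
there correspond the dual maps $\mathbb{F}^{(n)} : \mathcal{S}(\mathcal{B}) \to \mathcal{S}(\mathcal{A}^{\otimes(n+1)} \otimes \mathcal{B})$ given by

$$\mathbb{F}^{(n)} := \big(\mathrm{id}_{\mathcal{A}^{\otimes n}} \otimes \mathbb{F}\big) \circ \mathbb{F}^{(n-1)}\ ,\quad n \ge 1\ ,\qquad \mathbb{F}^{(0)} := \mathbb{F}\ .$$

Consider the states $\omega_{[-\ell,\ell]}$ defined on the local subalgebras $\mathcal{A}_{[-\ell,\ell]}$ by

$$\omega_{[-\ell,\ell]}\Big(\bigotimes_{k=-\ell}^{\ell} A_k\Big) = \mathrm{Tr}_{\mathcal{B}}\Big(\rho\ \mathbb{E}^{(2\ell)}\Big[\Big(\bigotimes_{k=-\ell}^{\ell} A_k\Big) \otimes \mathbb{1}_{\mathcal{B}}\Big]\Big) = \mathrm{Tr}\Big(\mathbb{F}^{(2\ell)}[\rho]\ \bigotimes_{k=-\ell}^{\ell} A_k \otimes \mathbb{1}_{\mathcal{B}}\Big)\ . \tag{7.92}$$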


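As a consequence of (7.88) and (7.89), they satisfy the compatibility relations
(7.85), that is $\omega_{[-\ell-1,\ell+1]}{\restriction}\mathcal{A}_{[-\ell,\ell]} = \omega_{[-\ell,\ell]}$, and the translation-invariance
conditions (7.86), namely $\omega_{[-\ell,\ell]} = \omega_{[-\ell+1,\ell+1]}$. We illustrate these properties by means
of the simplest non-trivial case and choose $\mathcal{A}_{[1,2]}$; then

$$\omega(A \otimes \mathbb{1}_2) = \mathrm{Tr}_{\mathcal{B}}\big(\rho\ \mathbb{E}\big[A \otimes \mathbb{E}[\mathbb{1}_{\mathcal{A}} \otimes \mathbb{1}_{\mathcal{B}}]\big]\big) = \mathrm{Tr}_{\mathcal{B}}\big(\rho\ \mathbb{E}[A \otimes \mathbb{1}_{\mathcal{B}}]\big) = \omega(A)$$

$$\omega(\mathbb{1}_1 \otimes A) = \mathrm{Tr}_{\mathcal{B}}\big(\rho\ \mathbb{E}\big[\mathbb{1}_{\mathcal{A}} \otimes \mathbb{E}[A \otimes \mathbb{1}_{\mathcal{B}}]\big]\big) = \mathrm{Tr}_{\mathcal{B}}\big(\rho\ \mathbb{E}[A \otimes \mathbb{1}_{\mathcal{B}}]\big) = \omega(A)\ .$$

Therefore, the family of local states $\omega_{[-\ell,\ell]}$ defines a global invariant state over the
quantum spin-chain $\mathcal{A}_{\mathbb{Z}}$.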
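The two-site check above can be reproduced numerically. The following minimal sketch makes several assumptions that are not part of the text: $d = b = 2$; the map is $\mathbb{E}[X] = V^\dagger X V$ with the single isometry $V = c_0\,|0\rangle \otimes \mathbb{1} + c_1\,|1\rangle \otimes \sigma_1$, where $c_0^2 + c_1^2 = 1$; and $\rho$ is the maximally mixed state on $\mathcal{B}$, which satisfies (7.89) for this particular $V$. Only the Python standard library is used and all names are illustrative.

```python
import math


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


ID2 = [[1, 0], [0, 1]]
SIGMA1 = [[0, 1], [1, 0]]
c0, c1 = math.cos(0.3), math.sin(0.3)

# V : C^b -> C^d (x) C^b, with the site factor first and the auxiliary factor last
V = [[c0 * x + c1 * y for x, y in zip(r0, r1)]
     for r0, r1 in zip(kron([[1], [0]], ID2), kron([[0], [1]], SIGMA1))]

rho = [[0.5, 0.0], [0.0, 0.5]]  # assumed state on B, fixed point for this V


def E(X):
    """Completely positive unital map E[X] = V^dagger X V."""
    return matmul(matmul(dagger(V), X), V)


def omega_one_site(A):
    return trace(matmul(rho, E(kron(A, ID2))))


def omega_two_sites(A1, A2):
    """rho o E[A1 (x) E[A2 (x) 1_B]], the two-site state as in (7.92)."""
    return trace(matmul(rho, E(kron(A1, E(kron(A2, ID2))))))
```

With these choices, `omega_two_sites(A, ID2)` and `omega_two_sites(ID2, A)` both agree with `omega_one_site(A)` up to rounding, mirroring the two displayed identities.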
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Definition 7.4.12 (Finitely Correlated States) Given a triplet (B, p, E) as specified 
before, all functionals w on Az locally defined on Aj;, j] by 


w(®j_; Ax) = Trp (o AG Digi Atl) , 
are translation invariant states called finitely correlated (FCS). 


Remark 7.4.14 The specification finitely correlated refers to the finite dimensionality
of the auxiliary algebra $\mathcal{B}$. Without such a restriction, every translation-invariant
state over $\mathcal{A}_{\mathbb{Z}}$ would be given as in the previous definition. Indeed, one could then
choose $\mathcal{B} := \mathcal{A}_{[1,+\infty)}$, $\rho := \omega \restriction \mathcal{A}_{[1,+\infty)}$ and as $\mathbb{E}$ the natural embedding of any $\mathcal{A}_{[i,j]}$
into $\mathcal{A}_{[1,+\infty)}$.


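Because of translation-invariance, $\omega \restriction \mathcal{A}_{[i,j]} = \omega \restriction \mathcal{A}_{[1,j-i+1]}$; therefore, the local
structure of $\omega$ is determined by the density matrices $\rho_{[1,n]}$ corresponding to $\omega \restriction \mathcal{A}_{[1,n]}$.
They are recursively obtained by means of the dual maps (7.92),

\[ \rho_{[1,n]} = \mathrm{Tr}_{\mathcal{B}}\big(\mathbb{F}^{(n-1)}[\rho]\big). \tag{7.93} \]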


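In order to take a closer look at the recursive structure of FCS, we make use of
the Kraus-Stinespring representation (5.210); concretely,

\[ \mathbb{E}[A \otimes B] = \sum_{j \in J} V_j^\dagger\, (A \otimes B)\, V_j, \qquad \sum_{j \in J} V_j^\dagger V_j = \mathbb{1}_{\mathcal{B}}, \tag{7.94} \]
\[ V_j : \mathbb{C}^b \to \mathbb{C}^d \otimes \mathbb{C}^b, \qquad V_j^\dagger : \mathbb{C}^d \otimes \mathbb{C}^b \to \mathbb{C}^b, \tag{7.95} \]

where $J$ is an index set of finite cardinality. With $|\psi^A_i\rangle$, $i = 1, 2, \ldots, d$ and $|\psi^B_k\rangle$, $k = 1, 2, \ldots, b$
two ONBs in $\mathbb{C}^d$, respectively $\mathbb{C}^b$, the action of $V_j$ can be represented in the following
two ways,

\[ V_j |\psi^B_i\rangle = \sum_{k=1}^{b} |\Psi^A_{j,ik}\rangle \otimes |\psi^B_k\rangle, \tag{7.96} \]
\[ V_j |\psi^B_i\rangle = \sum_{\ell=1}^{d} |\psi^A_\ell\rangle \otimes |\Psi^B_{j,i\ell}\rangle, \tag{7.97} \]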


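where the $\Psi^A_{j,ik} \in \mathbb{C}^d$, $\Psi^B_{j,i\ell} \in \mathbb{C}^b$ are in general neither orthogonal nor normalised. From
(7.96) it follows that

\[ V_j = \sum_{i,k=1}^{b} |\Psi^A_{j,ik}\rangle \otimes |\psi^B_k\rangle\langle\psi^B_i|, \qquad V_j^\dagger = \sum_{i,k=1}^{b} \langle\Psi^A_{j,ik}| \otimes |\psi^B_i\rangle\langle\psi^B_k|. \]

Thus, $\sum_{j \in J} V_j^\dagger V_j = \mathbb{1}_{\mathcal{B}}$, whence $\sum_{j \in J} \sum_{k=1}^{b} \langle\Psi^A_{j,pk}|\Psi^A_{j,qk}\rangle = \delta_{pq}$. On the other
hand, from (7.91) one gets

\[ \mathbb{F}[\rho] = \sum_{j \in J} \sum_{i,k=1}^{b} \sum_{p,q=1}^{b} \langle\psi^B_i|\rho|\psi^B_p\rangle\, |\Psi^A_{j,ik}\rangle\langle\Psi^A_{j,pq}| \otimes |\psi^B_k\rangle\langle\psi^B_q| \]
\[ \phantom{\mathbb{F}[\rho]} = \sum_{j \in J} \sum_{\ell,p,q=1}^{b} r_\ell\, |\Psi^A_{j,\ell p}\rangle\langle\Psi^A_{j,\ell q}| \otimes |\psi^B_p\rangle\langle\psi^B_q|, \tag{7.98} \]

where we have conveniently chosen the eigenprojections of $\rho$ as ONB in $\mathbb{C}^b$, that is
$\rho = \sum_{\ell=1}^{b} r_\ell\, |\psi^B_\ell\rangle\langle\psi^B_\ell|$. As condition (7.89) amounts to $\mathrm{Tr}_{\mathcal{A}} \mathbb{F}[\rho] = \rho$, the vectors
$\Psi^A_{j,ik}$ must also satisfy $\sum_{j \in J} \sum_{\ell=1}^{b} r_\ell\, \langle\Psi^A_{j,\ell q}|\Psi^A_{j,\ell p}\rangle = \delta_{pq}\, r_q$.

By recursively inserting (7.98) into (7.93), one gets the following expression for
the local density matrices $\rho_{[1,n]}$,

\[ \rho_{[1,n]} = \sum_{j^{(n)} \in J^n} \sum_{\ell,p=1}^{b} r_\ell\, |\Psi^{j^{(n)}}_{\ell p}\rangle\langle\Psi^{j^{(n)}}_{\ell p}|, \tag{7.99} \]
\[ |\Psi^{j^{(n)}}_{\ell p}\rangle := \sum_{i_1, \ldots, i_{n-1}=1}^{b} |\Psi^A_{j_1,\ell i_1}\rangle \otimes |\Psi^A_{j_2,i_1 i_2}\rangle \otimes \cdots \otimes |\Psi^A_{j_{n-1},i_{n-2} i_{n-1}}\rangle \otimes |\Psi^A_{j_n,i_{n-1} p}\rangle. \tag{7.100} \]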


Remark 7.4.15 Notice that, despite the recursive structure involving more and more
factor components, for each $n$-tuple $j^{(n)} = j_1 j_2 \cdots j_n \in J^n$ there are at most $b^2$ vectors
$|\Psi^{j^{(n)}}_{\ell p}\rangle \in (\mathbb{C}^d)^{\otimes n}$.


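Example 7.4.15 The vectors $\Psi^{j^{(n)}}_{\ell p}$ need not be normalized, $\|\Psi^{j^{(n)}}_{\ell p}\| \neq 1$; taking this
fact into account, (7.99) provides the following natural decomposition of the local
restrictions of FCS states,

\[ \rho_{[1,n]} = \sum_{j^{(n)} \in J^n} p(j^{(n)})\, \rho^{j^{(n)}}_{[1,n]}, \qquad \rho^{j^{(n)}}_{[1,n]} := \frac{1}{p(j^{(n)})} \sum_{\ell,p=1}^{b} r_\ell\, \|\Psi^{j^{(n)}}_{\ell p}\|^2\, P^{j^{(n)}}_{\ell p}, \tag{7.101} \]

where $P^{j^{(n)}}_{\ell p}$ projects onto $|\Psi^{j^{(n)}}_{\ell p}\rangle / \|\Psi^{j^{(n)}}_{\ell p}\|$. It follows that the support of each $\rho^{j^{(n)}}_{[1,n]}$
has dimension at most $b^2$. By defining the completely positive (non-unital) maps
$\mathcal{A} \otimes \mathcal{B} \ni A \otimes B \mapsto \mathbb{E}_j[A \otimes B] := V_j^\dagger\, (A \otimes B)\, V_j \in \mathcal{B}$, the weights $p(j^{(n)})$, $j^{(n)} = j_1 j_2 \cdots j_n \in J^n$, can be rewritten as

\[ p(j^{(n)}) = \sum_{\ell,p=1}^{b} r_\ell\, \|\Psi^{j^{(n)}}_{\ell p}\|^2 = \mathrm{Tr}\big(\rho\, \mathbb{E}_{j_1} \circ \mathbb{E}_{j_2} \circ \cdots \circ \mathbb{E}_{j_n}[\mathbb{1}_{\mathcal{A}^{\otimes n}} \otimes \mathbb{1}_{\mathcal{B}}]\big). \]

Since $\sum_{j \in J} \mathbb{E}_j = \mathbb{E}$, from (7.88) and (7.89) it follows that the probabilities $\pi^{(n)} = \{p(j^{(n)})\}_{j^{(n)} \in J^n}$
define a shift invariant global state $\omega_\pi$ over the Abelian algebra generated by tensor products
of infinitely many $\mathrm{card}(J) \times \mathrm{card}(J)$ diagonal matrices,
thus a classical spin-chain equipped with the state $\omega_\pi$.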


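As regards the action of $V_j$ in (7.97), we proceed as follows. Given the vectors
$|\Psi^B_{j,i\ell}\rangle \in \mathbb{C}^b$, where $j \in J$, $i = 1, 2, \ldots, b$ and $\ell = 1, 2, \ldots, d$, let $v_{j\ell} \in M_b(\mathbb{C})$ be the
matrix such that $\langle\psi^B_k|\, v_{j\ell}^\dagger\, |\psi^B_i\rangle = \langle\psi^B_k|\Psi^B_{j,i\ell}\rangle$. Then,

\[ V_j = \sum_{\ell=1}^{d} \sum_{i=1}^{b} |\psi^A_\ell\rangle \otimes v_{j\ell}^\dagger\, |\psi^B_i\rangle\langle\psi^B_i|, \tag{7.102} \]
\[ V_j^\dagger = \sum_{\ell=1}^{d} \sum_{i=1}^{b} \langle\psi^A_\ell| \otimes |\psi^B_i\rangle\langle\psi^B_i|\, v_{j\ell}. \tag{7.103} \]

Then, (7.88) implies $\mathbb{1}_{\mathcal{B}} = \sum_{j \in J} V_j^\dagger V_j = \sum_{j \in J} \sum_{\ell=1}^{d} v_{j\ell}\, v_{j\ell}^\dagger$. Further, the
dual map $\mathbb{F}$ reads

\[ \mathbb{F}[\rho] = \sum_{j \in J} \sum_{p,q=1}^{d} |\psi^A_p\rangle\langle\psi^A_q| \otimes v_{jp}^\dagger\, \rho\, v_{jq}. \tag{7.104} \]

It then turns out that, in terms of the $v_{j\ell}$'s, the translation-invariant condition (7.89)
amounts to $\sum_{j \in J} \sum_{\ell=1}^{d} v_{j\ell}^\dagger\, \rho\, v_{j\ell} = \rho$. Finally, using (7.93), the local density matrices
$\rho_{[1,n]}$ exhibit the following recursive structure,

\[ \rho_{[1,n]} = \sum_{j^{(n)} \in J^n} \sum_{i^{(n)}, k^{(n)}} |\psi^A_{i^{(n)}}\rangle\langle\psi^A_{k^{(n)}}|\; \mathrm{Tr}\big(v_{j^{(n)} i^{(n)}}^\dagger\, \rho\, v_{j^{(n)} k^{(n)}}\big), \tag{7.105} \]

where the inner sum runs over the $n$-tuples $i^{(n)} = i_1 i_2 \cdots i_n$, $k^{(n)} = k_1 k_2 \cdots k_n$ with entries in $\{1, 2, \ldots, d\}$,
$|\psi^A_{i^{(n)}}\rangle := |\psi^A_{i_1}\rangle \otimes |\psi^A_{i_2}\rangle \otimes \cdots \otimes |\psi^A_{i_n}\rangle$ and $v_{j^{(n)} i^{(n)}} := v_{j_1 i_1}\, v_{j_2 i_2} \cdots v_{j_n i_n}$. The
above expression is particularly suited to deal with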


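Definition 7.4.13 (Purely Generated FCS) A FCS $\omega$ is called purely generated if
the defining CPU map $\mathbb{E}$ consists of only one Kraus operator [135]:

\[ \mathbb{E}[A \otimes B] = V^\dagger\, (A \otimes B)\, V. \]

In terms of (7.102) and (7.103), the map $\mathbb{E}$ and its dual $\mathbb{F}$ read

\[ \mathbb{E}[A \otimes B] = \sum_{i,k=1}^{d} \langle\psi^A_i|A|\psi^A_k\rangle\; v_i\, B\, v_k^\dagger, \qquad \mathbb{F}[\rho] = \sum_{i,k=1}^{d} |\psi^A_i\rangle\langle\psi^A_k| \otimes v_i^\dagger\, \rho\, v_k, \]

whereby compatibility (7.88) and translation-invariance (7.89) impose

\[ \sum_{i=1}^{d} v_i\, v_i^\dagger = \mathbb{1}_b, \qquad \sum_{i=1}^{d} v_i^\dagger\, \rho\, v_i = \rho. \tag{7.106} \]

The states $\rho_{\mathcal{A} \otimes \mathcal{B}} := \mathbb{F}[\rho]$ on $\mathcal{A} \otimes \mathcal{B}$ and $\rho_{[1,2]} = \mathrm{Tr}_{\mathcal{B}}\big((\mathrm{id}_{\mathcal{A}} \otimes \mathbb{F}) \circ \mathbb{F}[\rho]\big)$ on $\mathcal{A}_{[1,2]}$
can be explicitly written out. Notice that, because of translation-invariance, $\rho_{[1,2]}$
describes any two nearest neighbor spins:

\[ \rho_{\mathcal{A} \otimes \mathcal{B}} = \sum_{i,j=1}^{d} |\psi^A_i\rangle\langle\psi^A_j| \otimes v_i^\dagger\, \rho\, v_j = \begin{pmatrix} v_1^\dagger \rho\, v_1 & \cdots & v_1^\dagger \rho\, v_d \\ \vdots & \ddots & \vdots \\ v_d^\dagger \rho\, v_1 & \cdots & v_d^\dagger \rho\, v_d \end{pmatrix}, \tag{7.107} \]
\[ \rho_{[1,2]} = \sum_{i,j=1}^{d} |\psi^A_i\rangle\langle\psi^A_j| \otimes \begin{pmatrix} R_{1i;j1} & \cdots & R_{1i;jd} \\ \vdots & \ddots & \vdots \\ R_{di;j1} & \cdots & R_{di;jd} \end{pmatrix}, \tag{7.108} \]

where $R_{ij;\ell k} := \mathrm{Tr}\big(v_i^\dagger\, v_j^\dagger\, \rho\, v_\ell\, v_k\big)$.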


Example 7.4.16 (AKLT Model) A typical instance of the recursive finitely correlated
structure is provided by the AKLT-model [5,6], a spin-chain consisting of spin 1
particles with nearest-neighbor interactions described by the Hamiltonian

\[ H = \sum_{k=1}^{N-1} \Big[ \frac{1}{2}\, \boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1} + \frac{1}{6}\, \big(\boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1}\big)^2 + \frac{1}{3} \Big], \tag{7.109} \]

where $\boldsymbol{S}_k = (S_{1k}, S_{2k}, S_{3k})$ represents the spin operator for the $k$-th spin along the
chain.
The possible values of the total spin of two nearest neighbors are 0, 1 and 2 with
corresponding orthogonal projectors $P^{(0)}$, $P^{(1)}$, respectively $P^{(2)}$. Therefore, since

\[ \boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1} = -2 P^{(0)} - P^{(1)} + P^{(2)}, \qquad \big(\boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1}\big)^2 = 4 P^{(0)} + P^{(1)} + P^{(2)} \]

and $P^{(0)} + P^{(1)} + P^{(2)} = \mathbb{1}$, it follows that the interaction between sites $k$ and $k+1$
amounts to the projection $P^{(2)}$ onto the subspace with total spin 2.

The spin 1 at site $k$ can be described by means of two spins 1/2, labeled by $k$,
$\bar{k}$, by projecting with $P_{k\bar{k}}$ from $\mathbb{C}^4$ onto the 3-dimensional subspace orthogonal to
the singlet state $|\Psi_0\rangle$ of the pair of spins 1/2 at $k$ and $\bar{k}$. Further, after associating
the spins 1 at $k$, $k+1$ with the pairs $k, \bar{k}$, respectively $k+1, \overline{k+1}$ of spins 1/2,
one imposes valence bonds between the pairs, by requiring that the spins 1/2 at $\bar{k}$
and $k+1$ be in a singlet state $|\Psi_0\rangle_{\bar{k},k+1}$. It follows that the common state of the pairs
$k, \bar{k}$ and $k+1, \overline{k+1}$, namely of two neighboring spins 1, is an eigenstate of $P^{(2)}$ with
eigenvalue 0.

Further, by appending two spins 1/2 at the opposite ends, $0$ and $N+1$, of the
spin 1 chain, it thus follows that the vector state

\[ \Big(\bigotimes_{k=1}^{N} P_{k\bar{k}}\Big)\; |\Psi_0\rangle_{0,1} \otimes |\Psi_0\rangle_{\bar{1},2} \otimes \cdots \otimes |\Psi_0\rangle_{\bar{N},N+1} \tag{7.110} \]

is the unique ground state for the Hamiltonian

\[ H + \frac{2}{3}\big(\mathbb{1} + \boldsymbol{s}_0 \cdot \boldsymbol{S}_1\big) + \frac{2}{3}\big(\mathbb{1} + \boldsymbol{s}_{N+1} \cdot \boldsymbol{S}_N\big), \tag{7.111} \]

which is obtained from (7.109) by adding two boundary interactions involving the
boundary spin 1/2 operators $\boldsymbol{s}_0$ and $\boldsymbol{s}_{N+1}$. In the limit of an infinitely long spin-chain,
the above valence-bond construction provides a unique, translation-invariant
ground state of the AKLT-model, known as valence-bond solid, which exhibits
short-range correlations and an energy gap.
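The reduction of the bond interaction to the projection onto total spin 2 is plain arithmetic on the three eigenvalues of $\boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1}$, and can be checked with a minimal sketch. It assumes the standard AKLT coefficients 1/2, 1/6 and 1/3 for the bond term and the usual formula $\boldsymbol{S}_k \cdot \boldsymbol{S}_{k+1} = (J(J+1) - 4)/2$ on total spin $J$; the function names are illustrative only.

```python
from fractions import Fraction as Fr

def dot_eigenvalue(total_spin):
    """Eigenvalue of S_k . S_{k+1} on total spin J for two spin-1 particles.

    Uses S.S = (J(J+1) - 2 s(s+1)) / 2 with s = 1, i.e. (J(J+1) - 4) / 2.
    """
    return Fr(total_spin * (total_spin + 1) - 4, 2)

def bond_eigenvalue(total_spin):
    """Eigenvalue of the assumed bond term x/2 + x^2/6 + 1/3, with x = S.S."""
    x = dot_eigenvalue(total_spin)
    return Fr(1, 2) * x + Fr(1, 6) * x * x + Fr(1, 3)

# Spectrum of the bond term on the sectors J = 0, 1, 2.
bond_spectrum = {J: bond_eigenvalue(J) for J in (0, 1, 2)}
print(bond_spectrum)
```

Exact rational arithmetic is used so that the comparison with the eigenvalues of a projector involves no rounding.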

In the limit of an infinite spin-chain, its ground state, the valence-bond solid,
corresponds to the triplet $(\mathcal{B}, \rho, \mathbb{E})$ with $\mathcal{B} = M_2$, $\rho(B) = \frac{1}{2}\mathrm{Tr}(B)$ and

\[ \mathbb{E} : M_3 \otimes M_2 \ni A \otimes B \mapsto V^\dagger\, (A \otimes B)\, V \in M_2, \tag{7.112} \]

where, with $|b_{1,2}\rangle \in \mathbb{C}^2$ the eigenvectors of the Pauli matrix $\sigma_3$ relative to the eigenvalues
$1, -1$ and $|a_{1,2,3}\rangle$ the eigenvectors of $S_3$ relative to the eigenvalues $-1, 0, 1$,

2 
|a2, b2} — ii bı). 
(1.113) 


1 
V|bi) = = 45 et bi) — Vy bi), V|b2) = ——= 
V3 
From (7.102), with $\sigma_\pm := (\sigma_1 \pm i \sigma_2)/2$, it thus follows that


= 2 a Se (7.114) 
vi = 3 +> v2 = wa v3 = gan : 


One can thus check that the conditions (7.106) are satisfied and, moreover, that
the identity matrix $\mathbb{1}_2 \in M_2$ is the only solution of the second relation in (7.106),
in agreement with the translation invariance and purity of the valence bond solid.
Further, from (7.107) one explicitly computes


00 00 00 
1-os- 0 \ ilo ooo 
PAB = = | V2o} 1 vV2oe- |=- , (7.115) 
ONG wo ae 00 0 1 v20 
si a 00 0/210 
00 0000 
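while (7.108) gives the nearest neighbor states

\[ \rho_{[1,2]} = \frac{1}{9} \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 1 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}. \tag{7.116} \]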
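The conditions (7.106) and the nearest neighbor matrix (7.116) can be checked numerically with a short sketch. The three matrices below are an assumption: they are the standard matrix-product choice for the AKLT state, $\sqrt{2/3}\,\sigma_+$, $-\sigma_3/\sqrt{3}$ and $-\sqrt{2/3}\,\sigma_-$, which is fixed only up to sign and ordering conventions and may differ from the one in (7.114). The entries of the two-site state are computed as $\mathrm{Tr}(v_{p_2}^\dagger v_{p_1}^\dagger \rho\, v_{q_1} v_{q_2})$.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Adjoint of a real matrix (plain transpose)."""
    return [list(row) for row in zip(*a)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

SIGMA_PLUS = [[0.0, 1.0], [0.0, 0.0]]
SIGMA_MINUS = [[0.0, 0.0], [1.0, 0.0]]
SIGMA_3 = [[1.0, 0.0], [0.0, -1.0]]

# Assumed AKLT matrices v_1, v_2, v_3 (convention-dependent, see lead-in).
v = [scale(math.sqrt(2 / 3), SIGMA_PLUS),
     scale(-math.sqrt(1 / 3), SIGMA_3),
     scale(-math.sqrt(2 / 3), SIGMA_MINUS)]
rho = [[0.5, 0.0], [0.0, 0.5]]  # rho(B) = Tr(B) / 2

def unitality_sum():
    """Left-hand side of the first relation in (7.106): sum_i v_i v_i^dagger."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for vi in v:
        total = add(total, matmul(vi, dagger(vi)))
    return total

def invariance_sum():
    """Left-hand side of the second relation: sum_i v_i^dagger rho v_i."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for vi in v:
        total = add(total, matmul(matmul(dagger(vi), rho), vi))
    return total

def two_site_state():
    """9 x 9 matrix of rho_[1,2]; row index 3*p1 + p2, column index 3*q1 + q2."""
    m = [[0.0] * 9 for _ in range(9)]
    for p1 in range(3):
        for p2 in range(3):
            left = matmul(dagger(v[p2]), dagger(v[p1]))
            for q1 in range(3):
                for q2 in range(3):
                    right = matmul(v[q1], v[q2])
                    m[3 * p1 + p2][3 * q1 + q2] = trace(matmul(matmul(left, rho), right))
    return m

# Integer matrix to be compared with 9 * rho_[1,2] in (7.116).
nine_rho = [[round(9 * x) for x in row] for row in two_site_state()]
```

With this choice the first and last rows of `nine_rho` vanish, reflecting the absence of any total spin 2 component in the state of two neighboring spins.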


Remark 7.4.16 Finitely correlated states are a useful arena for investigating the
behavior of entanglement in quantum spin-chains, for these are determined by the
triplet $(\mathcal{B}, \rho, \mathbb{E})$ (see [53,246]). Furthermore, they are particularly important as ground
states of certain solid state Hamiltonians, whereby one is interested either in the relations
between entanglement and long-range order effects [270] or in the possibility
of creating entanglement between distant sites by suitable local measurements [286].


7.4.5.3 Price-Powers Shifts 
The so-called Price-Powers shifts [291,292] are quantum dynamical systems described
by an infinite-dimensional $C^*$ algebra $\mathcal{A}_g$, whose building blocks are the identity
operator $\mathbb{1}$ and operators $e_i$, $i \in \mathbb{N}$, satisfying

\[ e_i^2 = \mathbb{1}. \tag{7.117} \]

Their algebraic properties are determined by a function

\[ g : \mathbb{N}_0 \to \{0, 1\}, \qquad g(0) = 0, \tag{7.118} \]

called bitstream: according to its values, different $e_i$'s commute or anticommute

\[ e_i\, e_j = (-1)^{g(|i-j|)}\, e_j\, e_i, \qquad \forall\, i, j \in \mathbb{N}. \tag{7.119} \]

By means of the above relations, every product of finitely many $e_i$'s can be reduced
(up to a sign) to an operator of the form

\[ W_{\boldsymbol{i}} = e_{i_1} e_{i_2} \cdots e_{i_n}, \tag{7.120} \]

where $\boldsymbol{i} = i_1 i_2 \cdots i_n \in \mathbb{N}^*$ stands for any choice of finitely many indices such that
$i_1 < i_2 < \cdots < i_n$: $\boldsymbol{i}$ will be called the support of $W_{\boldsymbol{i}}$.

It is convenient for later purposes to explicitly compute the commutator of two
operators $W_{\boldsymbol{i}}$ and $W_{\boldsymbol{j}}$ with supports $\boldsymbol{i} = i_1 i_2 \cdots i_n$ and $\boldsymbol{j} = j_1 j_2 \cdots j_m$:

\[ [W_{\boldsymbol{i}}, W_{\boldsymbol{j}}] = W_{\boldsymbol{i}} W_{\boldsymbol{j}} \Big(1 - (-1)^{\sum_{a=1}^{n} \sum_{b=1}^{m} g(|i_a - j_b|)}\Big). \tag{7.121} \]


By means of the operators $W_{\boldsymbol{i}}$ one constructs finite-dimensional local subalgebras.
Indeed, again by means of (7.117) and (7.119), products of $W_{\boldsymbol{i}}$, with $\boldsymbol{i}$'s consisting
of indices from a same interval $[p, q]$, $p \le q$, reduce up to a sign to some other $W_{\boldsymbol{i}}$
from the same interval. Thus, the algebra $\mathcal{A}_{[p,q]}$ generated by $W_{\boldsymbol{i}}$ with $\boldsymbol{i}$ from $[p, q]$
is a finite-dimensional unital $C^*$ algebra that can be embedded into the spin algebra
$M_2(\mathbb{C})^{\otimes (q-p+1)}$. This becomes apparent by representing the operators $e_j$ by means
of tensor products of Pauli matrices:

\[ e_j = \bigotimes_{i=1}^{j-1} \big(\sigma_z^{g(j-i)}\big)_i \otimes (\sigma_x)_j \otimes \mathbb{1}_{[j+1}. \tag{7.122} \]

Since $\sigma_x \sigma_z = -\sigma_z \sigma_x$, one can check that the relations (7.119) are indeed satisfied.
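The check of the relations (7.117) and (7.119) in the Pauli representation can be carried out explicitly on a few sites. The sketch below builds the operators $e_j$ on four sites as integer matrices, placing $\sigma_z^{g(j-i)}$ on the sites $i < j$, $\sigma_x$ on site $j$ and the identity afterwards. The bitstream `g` is a hypothetical choice made only for illustration, and all function names are assumptions of this sketch.

```python
def kron(a, b):
    """Kronecker product of two integer matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

def g(n):
    """Hypothetical bitstream with g(0) = 0: only distances 1 and 3 anticommute."""
    return 1 if n in (1, 3) else 0

def e(j, sites):
    """Pauli representation of e_j on the given number of sites (j = 1, 2, ...)."""
    out = [[1]]
    for i in range(1, sites + 1):
        if i < j:
            factor = SZ if g(j - i) else I2
        elif i == j:
            factor = SX
        else:
            factor = I2
        out = kron(out, factor)
    return out

def check_relations(sites=4):
    """True if e_j^2 = 1 and e_i e_j = (-1)^g(|i-j|) e_j e_i for all sites."""
    dim = 2 ** sites
    identity = [[1 if r == c else 0 for c in range(dim)] for r in range(dim)]
    ops = {j: e(j, sites) for j in range(1, sites + 1)}
    for i in ops:
        if matmul(ops[i], ops[i]) != identity:
            return False
        for j in ops:
            sign = (-1) ** g(abs(i - j))
            lhs = matmul(ops[i], ops[j])
            rhs = [[sign * x for x in row] for row in matmul(ops[j], ops[i])]
            if lhs != rhs:
                return False
    return True

print(check_relations(4))
```

Integer entries keep all comparisons exact; other bitstreams can be tried by changing `g`, as long as `g(0) = 0`.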
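The local $C^*$ algebras generate the $*$-algebra

\[ \mathcal{A}_g^{\mathrm{loc}} = \bigcup_{n \ge 1} \mathcal{A}_{[1,n]}, \]

and by norm closure (for instance as a subalgebra of the quantum spin chain
$\bigotimes_{k} M_2(\mathbb{C})$) the quasi-local algebra

\[ \mathcal{A}_g = \overline{\mathcal{A}_g^{\mathrm{loc}}}^{\,\|\cdot\|}. \tag{7.123} \]

As for quantum spin-chains the dynamics on $\mathcal{A}_g$ is given by the shift to the right of
the support of the operators $W_{\boldsymbol{i}}$:

\[ W_{\boldsymbol{i}} \mapsto \Theta_\sigma^t[W_{\boldsymbol{i}}] =: W_{\boldsymbol{i}+t} = e_{i_1+t}\, e_{i_2+t} \cdots e_{i_n+t}, \qquad t \in \mathbb{N}. \tag{7.124} \]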


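Proposition 7.4.10 Let $W_\emptyset := \mathbb{1}$; the linear functional $\omega : \mathcal{A}_g \to \mathbb{C}$ obtained by
setting

\[ \omega(W_{\boldsymbol{i}}) = \delta_{\boldsymbol{i},\emptyset} \tag{7.125} \]

and by linearly extending it to the local $*$-algebra $\bigcup_{n \ge 1} \mathcal{A}_{[1,n]}$ defines a tracial $\Theta_\sigma$-invariant state on $\mathcal{A}_g$. This is
the only tracial state on $\mathcal{A}_g$ if and only if the following property holds: for all finite
supports $\boldsymbol{i} = i_1 i_2 \cdots i_n \in \mathbb{N}^*$ there exists $k \in \mathbb{N}$ such that $\sum_{\ell=1}^{n} g(|k - i_\ell|)$ is odd.

Proof The positivity of $\omega$ can be checked by setting $W = \sum_{\boldsymbol{i}} c_{\boldsymbol{i}} W_{\boldsymbol{i}}$, $c_{\boldsymbol{i}} \in \mathbb{C}$ and
considering

\[ \omega(W^\dagger W) = \sum_{\boldsymbol{i},\boldsymbol{j}} c_{\boldsymbol{i}}^*\, c_{\boldsymbol{j}}\; \omega(W_{\boldsymbol{i}}^\dagger W_{\boldsymbol{j}}). \]

The expectations $\omega(W_{\boldsymbol{i}}^\dagger W_{\boldsymbol{j}})$ vanish unless $W_{\boldsymbol{i}}^\dagger W_{\boldsymbol{j}} = \mathbb{1}$, which can be true only if each
$e_{i_\ell}$ in $W_{\boldsymbol{i}}$ is matched by an $e_{j_\ell}$ in $W_{\boldsymbol{j}}$. Thus, $\omega(W_{\boldsymbol{i}}^\dagger W_{\boldsymbol{j}}) = \delta_{\boldsymbol{i}\boldsymbol{j}}$ whence $\omega(W^\dagger W) = \sum_{\boldsymbol{i}} |c_{\boldsymbol{i}}|^2 \ge 0$.
Therefore, $\omega$ is positive, thus continuous (see (5.51)) and can be extended by
continuity to the whole of $\mathcal{A}_g$. That $\omega \circ \Theta_\sigma = \omega$ follows directly from (7.124) and
(7.125). Such a state has the tracial property $\omega(W_{\boldsymbol{i}} W_{\boldsymbol{j}}) = \omega(W_{\boldsymbol{j}} W_{\boldsymbol{i}})$.

Suppose that a state $\tilde{\omega}$ on $\mathcal{A}_g$ has the tracial property and that for all finite supports
$\boldsymbol{i} = i_1 i_2 \cdots i_n \in \mathbb{N}^*$ there exists $k$ such that $\sum_{\ell=1}^{n} g(|k - i_\ell|)$ is odd, then [11] using
(7.117) and (7.119), one gets

\[ \tilde{\omega}(W_{\boldsymbol{i}}) = \tilde{\omega}(e_k^2\, W_{\boldsymbol{i}}) = \tilde{\omega}(e_k\, W_{\boldsymbol{i}}\, e_k) = (-1)^{\sum_{\ell=1}^{n} g(|k - i_\ell|)}\; \tilde{\omega}(e_k^2\, W_{\boldsymbol{i}}) = -\tilde{\omega}(W_{\boldsymbol{i}}). \]

Therefore, $\tilde{\omega}(W_{\boldsymbol{i}}) = 0$ for all $W_{\boldsymbol{i}} \neq \mathbb{1}$ whence $\tilde{\omega} = \omega$.
Vice versa, if there exists a support $\boldsymbol{i} = i_1 i_2 \cdots i_n$ such that for all $k \in \mathbb{N}$ the sum $\sum_{\ell=1}^{n} g(|k - i_\ell|)$
is even, then (7.121) implies that $W_{\boldsymbol{i}}$ commutes with all local operators and thus with
$\mathcal{A}_g$, whence, setting $W := \mathbb{1} + W_{\boldsymbol{i}}$ ($\omega(W) = 1$), it turns out that

\[ \mathcal{A}_g \ni W_{\boldsymbol{j}} \mapsto \omega_W(W_{\boldsymbol{j}}) := \omega(W\, W_{\boldsymbol{j}}), \qquad W_{\boldsymbol{j}} \in \mathcal{A}_g, \]

defines another state on $\mathcal{A}_g$ with the tracial property.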


Remark 7.4.17 The property that ensures the uniqueness of the tracial state w 
defined by (7.125) is guaranteed by non-periodic bitstreams [264]; we shall assume 
this in the following. 
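The parity condition of Proposition 7.4.10 and the commutation sign in (7.121) only involve sums of bitstream values, so they are easy to explore by hand or with a few lines of code. The following sketch is illustrative only: the two bitstreams and the search bound `k_max` are assumptions, and a finite search can exhibit a witness $k$ but cannot by itself prove that none exists.

```python
def commutation_exponent(support_i, support_j, g):
    """Parity of sum_{a,b} g(|i_a - j_b|): 1 means W_i and W_j anticommute."""
    return sum(g(abs(a - b)) for a in support_i for b in support_j) % 2

def odd_witness(support, g, k_max):
    """Smallest k in 1..k_max with sum_l g(|k - i_l|) odd, or None if none is found."""
    for k in range(1, k_max + 1):
        if sum(g(abs(k - i)) for i in support) % 2 == 1:
            return k
    return None

def g_zero(n):
    """Bitstream g = 0: all e_i commute (classical Bernoulli case)."""
    return 0

def g_fermi(n):
    """Bitstream with g(n) = 1 for all n >= 1: distinct e_i anticommute."""
    return 1 if n >= 1 else 0

print(odd_witness((1, 2), g_fermi, 10))   # a witness exists for this support
print(odd_witness((1, 2), g_zero, 50))    # no witness: W_i is central when g = 0
```

For `g_zero` every $W_{\boldsymbol{i}}$ commutes with the whole algebra, which is consistent with the existence of many tracial states in the commutative case.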


Definition 7.4.14 (Price-Powers Shifts) We shall call Price-Powers shifts the
dynamical triplets $(\mathcal{A}_g, \Theta_\sigma, \omega)$ constructed as above with a unique invariant tracial
state $\omega$.


Examples 7.4.17 ([14]) 


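1. The von Neumann algebras $\mathcal{M}_g := \pi_\omega(\mathcal{A}_g)''$ that arise from the strong closure
of $\mathcal{A}_g$ in the GNS construction based on $\omega$ are hyperfinite. By extending $\Theta_\sigma$ and
$\omega$ to $\mathcal{M}_g$ one gets von Neumann triplets $(\mathcal{M}_g, \Theta_\sigma, \omega)$ with $\omega$ still a unique tracial
state.

2. If $g \equiv 0$ then $(\mathcal{M}_0, \Theta_\sigma, \omega)$ is an algebraic version of the classical balanced two-valued
Bernoulli shift $(\Omega_2, T_\sigma, \mu)$. The von Neumann algebra $\mathcal{M}_0$ is generated
by the projections $p_i^{\pm} := \frac{\mathbb{1} \pm e_i}{2}$, which are orthogonal for a same index $i$ and
otherwise commute.

3. If $g \not\equiv 0$, because of the existence of a unique normalized trace, all $\mathcal{M}_g$ are hyperfinite
factors of type $II_1$. Indeed, their center $\mathcal{Z}_g := \mathcal{M}_g \cap \mathcal{M}_g'$ is trivial; otherwise
there would be a positive $Z \in \mathcal{Z}_g$ with $\omega(Z) > 0$ such that $\omega_Z(W_{\boldsymbol{i}}) := \frac{\omega(Z\, W_{\boldsymbol{i}})}{\omega(Z)}$
gives a different tracial state on $\mathcal{M}_g$, contradicting the uniqueness of $\omega$:

\[ \omega_Z(W_{\boldsymbol{i}} W_{\boldsymbol{j}}) = \frac{\omega(Z\, W_{\boldsymbol{i}} W_{\boldsymbol{j}})}{\omega(Z)} = \frac{\omega(W_{\boldsymbol{j}}\, Z\, W_{\boldsymbol{i}})}{\omega(Z)} = \frac{\omega(Z\, W_{\boldsymbol{j}} W_{\boldsymbol{i}})}{\omega(Z)} = \omega_Z(W_{\boldsymbol{j}} W_{\boldsymbol{i}}). \]

4. If $g(n) = 1$ for all $n \ge 1$, then $\mathcal{A}_g$ amounts to a discrete Fermi algebra endowed with an infinite
temperature state. Indeed, $e_i e_j + e_j e_i = 0$ if $i \neq j$ so that the operators

\[ a_i := \frac{e_{2i-1} + i\, e_{2i}}{2}, \qquad a_i^\dagger := \frac{e_{2i-1} - i\, e_{2i}}{2}, \]

$i \ge 1$, satisfy the CAR (5.64). Moreover, the expectations

\[ \omega(a_i^\dagger a_j) = \frac{\delta_{ij}}{2} \]

are those of a Fermionic KMS state at infinite temperature (see Example (7.4.3)).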

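5. The bitstream can be chosen such that the von Neumann dynamical triplets
$(\mathcal{A}_g, \Theta_\sigma, \omega)$ are asymptotically highly anti-commutative [264]. This means the
following: there exists a subset $\mathcal{S} \subset \mathcal{A}_g$ such that 1) the set $\mathbb{1} \cup \mathcal{S}$ is dense in $\mathcal{A}_g$
and 2) for any $S \in \mathcal{S}$, $\varepsilon > 0$ and $N \in \mathbb{N}$ there exist $0 < n_1 < n_2 < \cdots < n_N \in \mathbb{N}$
such that the anti-commutators satisfy

\[ \Big\| \big\{ \Theta_\sigma^{n_i}[S]^\dagger,\, \Theta_\sigma^{n_j}[S] \big\} \Big\| \le \varepsilon \]

for all $n_i \neq n_j$. In this case the tracial state $\omega$ turns out to be the only state which is
invariant under the shift $\Theta_\sigma$. Indeed, choose $S \in \mathcal{S}$ and set $X := \frac{1}{N} \sum_{i=1}^{N} \Theta_\sigma^{n_i}[S]$,
then

\[ \big\| X^\dagger X + X X^\dagger \big\| \le \frac{1}{N^2} \sum_{i=1}^{N} \Big\| \big\{ \Theta_\sigma^{n_i}[S]^\dagger, \Theta_\sigma^{n_i}[S] \big\} \Big\| + \frac{1}{N^2} \sum_{i \neq j} \Big\| \big\{ \Theta_\sigma^{n_i}[S]^\dagger, \Theta_\sigma^{n_j}[S] \big\} \Big\| \le \frac{2 \|S\|^2}{N} + \frac{\varepsilon (N-1)}{N}. \]

Further, if $\nu$ is a translation-invariant state on $\mathcal{A}_g$, then $\nu(X) = \nu(S)$; thus, by
applying (5.51),

\[ |\nu(S)|^2 = |\nu(X)|^2 \le \frac{\nu(X^\dagger X) + \nu(X X^\dagger)}{2} \le \frac{\|S\|^2}{N} + \frac{\varepsilon (N-1)}{2N}. \]

Because of the arbitrariness of $N$ and $\varepsilon > 0$, it follows that $\nu(S) = 0$ for all $S \in \mathcal{S}$
and because of the assumed density of the set $\mathbb{1} \cup \mathcal{S}$, $\nu$ coincides with $\omega$ as defined
in (7.125).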


Like in Example 7.4.12, Price-Powers shifts are weakly mixing for all bitstreams;
indeed, because of the quasi-local structure of the algebra and of the tracial
property of $\omega$, one need only study the asymptotic behavior of

\[ \omega(W_{\boldsymbol{i}}\, \Theta_\sigma^t[W_{\boldsymbol{j}}]) = \omega(W_{\boldsymbol{i}}\, W_{\boldsymbol{j}+t}). \]

Clearly, for sufficiently large $t$, $\boldsymbol{i} \cap (\boldsymbol{j} + t) = \emptyset$, then

\[ \lim_{t \to +\infty} \omega(W_{\boldsymbol{i}}\, \Theta_\sigma^t[W_{\boldsymbol{j}}]) = 0, \]

unless $\boldsymbol{i} = \boldsymbol{j} = \emptyset$.
As regards strong-mixing, it is convenient to consider strong-asymptotic
Abelianess first; namely, at its simplest, using (7.121), it turns out that

\[ \omega\big([e_i, e_{j+t}]^\dagger\, [e_i, e_{j+t}]\big) = \big(1 - (-1)^{g(|i-j-t|)}\big)^2. \]

Therefore, unlike for weak-asymptotic Abelianess, the possibility of strong-asymptotic
Abelianess depends on the asymptotic behavior of the bitstream; for
instance, highly anti-commutative Price-Powers shifts cannot be strongly asymptotic
Abelian.


7.5 Von Neumann Entropy Rate 


As seen in the introduction to this chapter, the usual setting of quantum statistical
mechanics consists of a quasi-local algebra $\mathcal{A}$ which is the $C^*$ inductive limit of local
$C^*$-algebras $\mathcal{A}_V \subseteq \mathbb{B}(\mathbb{H}_V)$ of operators localized in finite volumes $V \subset \mathbb{R}^3$; also, $\mathcal{A}$
is equipped with a locally normal state, namely with a state whose local restriction
to $\mathcal{A}_V$, $\omega \restriction \mathcal{A}_V$, is a density matrix $\rho_V \in \mathbb{B}_1^+(\mathbb{H}_V)$. Usually, $\omega$ is translation-invariant,
that is $\rho_{V+a} = \rho_V$, where $V + a$ denotes the volume $V$ rigidly translated by $a \in \mathbb{R}^3$
(or by $a \in \mathbb{Z}^3$ in the case of a lattice system).

Consider two disjoint volumes $V_1$ and $V_2$ and let $V := V_1 \cup V_2$; then, $\mathcal{A}_V = \mathcal{A}_{V_1} \otimes \mathcal{A}_{V_2}$
and $\rho_{V_{1,2}} = \mathrm{Tr}_{V_{2,1}}\, \rho_V$, namely the states localized within $V_{1,2}$ are
obtained as marginal states of $\rho_V$ localized within the larger volume $V$. Each local
state $\rho_V$ has von Neumann entropy

\[ S(V) := S(\rho_V) = -\mathrm{Tr}(\rho_V \log \rho_V); \]

then, the subadditivity of the von Neumann entropy, that is the upper bound in (5.171),
reads

\[ S(V) \le S(V_1) + S(V_2), \tag{7.126} \]

where the equality holds if and only if $\rho_V = \rho_{V_1} \otimes \rho_{V_2}$. In order to understand the
meaning of strong subadditivity in this setting, consider two volumes $V$ and $U$ such
that $V_2 := V \cap U \neq \emptyset$ and set $V_1 := V \setminus V_2$, $V_3 := U \setminus V_2$, $W := V_1 \cup V_2 \cup V_3 = U \cup V$.
Since $V_{1,2,3}$ are disjoint volumes, it follows that $\mathcal{A}_W = \mathcal{A}_{V_1} \otimes \mathcal{A}_{V_2} \otimes \mathcal{A}_{V_3}$,
$\mathcal{A}_V = \mathcal{A}_{V_1} \otimes \mathcal{A}_{V_2}$ and $\mathcal{A}_U = \mathcal{A}_{V_2} \otimes \mathcal{A}_{V_3}$; further

\[ \rho_{V_2} = \mathrm{Tr}_{V_1 \cup V_3}(\rho_W), \qquad \rho_V = \mathrm{Tr}_{V_3}(\rho_W), \qquad \rho_U = \mathrm{Tr}_{V_1}(\rho_W). \]

Then (5.172) reads

\[ S(U \cup V) + S(V_2) \le S(U) + S(V). \tag{7.127} \]

In general, the von Neumann entropy of $\rho_V$ diverges when $V \uparrow \mathbb{R}^3$ (or $V \uparrow \mathbb{Z}^3$);
one thus wonders whether the rate $S(V)/|V|$ exists when $V \uparrow \mathbb{R}^3, \mathbb{Z}^3$, where
$|V| = \int_V \mathrm{d}\boldsymbol{r}$. Among the many ways a sequence of volumes may fill the whole
space $\mathbb{R}^3$ (or $\mathbb{Z}^3$), a convenient one [372] is to consider a family of parallelepipeds
$V(\boldsymbol{a}) := \{\boldsymbol{x} = (x_1, x_2, x_3) \in \mathbb{R}^3 : -a_i \le x_i \le a_i\}$, where $\boldsymbol{a} \in \mathbb{R}^3_+$, and then to let
each $a_i \to +\infty$ so that $V(\boldsymbol{a}) \uparrow \mathbb{R}^3$.


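Proposition 7.5.1 (Mean von Neumann Entropy [372]) If $(\mathcal{A}, \omega)$ is a quasi-local
shift-dynamical system with a locally normal translation invariant state $\omega$, its mean
von Neumann entropy is given by

\[ s(\omega) := \lim_{V(\boldsymbol{a}) \uparrow \mathbb{R}^3} \frac{S(V(\boldsymbol{a}))}{|V(\boldsymbol{a})|} = \inf_{V(\boldsymbol{a})} \frac{S(V(\boldsymbol{a}))}{|V(\boldsymbol{a})|}. \tag{7.128} \]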
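Proof ([372]) Because of translation invariance, in (7.128) we can consider parallelepipeds
of the form $V(\boldsymbol{a}) = \{\boldsymbol{x} \in \mathbb{R}^3 : 0 \le x_i \le a_i\}$. Let $s(\omega)$ denote the infimum in (7.128); choose $\varepsilon > 0$ and a parallelepiped
$V(\boldsymbol{a}_0)$ in such a way that

\[ s(\omega) = \inf_{V(\boldsymbol{a})} \frac{S(V(\boldsymbol{a}))}{|V(\boldsymbol{a})|} \le \frac{S(V(\boldsymbol{a}_0))}{|V(\boldsymbol{a}_0)|} \le s(\omega) + \varepsilon. \qquad (*) \]

By decomposing $\mathbb{R}_+ \ni a_i = n_i a_i^0 + b_i$ with $n_i \in \mathbb{N}$ and $0 \le b_i < a_i^0$, any other $V(\boldsymbol{a})$
can be written as the union of disjoint parallelepipeds

\[ V(\boldsymbol{a}) = \bigcup_{\substack{0 \le k_1 \le n_1 - 1 \\ 0 \le k_2 \le n_2 - 1 \\ 0 \le k_3 \le n_3 - 1}} V_{\boldsymbol{k}}(\boldsymbol{a}_0) \;\cup\; V_{\boldsymbol{b}}(\boldsymbol{a}_0), \]
\[ V_{\boldsymbol{k}}(\boldsymbol{a}_0) := \big\{\boldsymbol{x} \in \mathbb{R}^3 : k_i a_i^0 \le x_i \le (k_i + 1) a_i^0\big\}, \]
\[ V_{\boldsymbol{b}}(\boldsymbol{a}_0) := \big\{\boldsymbol{x} \in \mathbb{R}^3 : n_i a_i^0 \le x_i \le n_i a_i^0 + b_i\big\}. \]

Then, (7.126) yields

\[ S(V(\boldsymbol{a})) \le \Big(\prod_{i=1}^{3} n_i\Big)\, S(V(\boldsymbol{a}_0)) + S(V_{\boldsymbol{b}}(\boldsymbol{a}_0)). \qquad (**) \]

By translation, $V_{\boldsymbol{b}}(\boldsymbol{a}_0)$ can be embedded within $V(\boldsymbol{a}_0)$; furthermore, since $0 \le b_i < a_i^0$,
each of the intervals $[0, b_i]$ is the intersection of the interval $[0, a_i^0]$ with an interval $[-c_i, a_i^0 - c_i]$,
$c_i \ge 0$, of the same length. It thus follows that $V_{\boldsymbol{b}}(\boldsymbol{a}_0)$ can be written as the
intersection $V_2$ of $V := V(\boldsymbol{a}_0)$ with a suitably translated $V(\boldsymbol{a}_0)$ denoted by $U$. Then,
from (7.127) and translation-invariance one derives the upper bound

\[ S(V_{\boldsymbol{b}}(\boldsymbol{a}_0)) = S(V_2) \le S(U \cup V) + S(V_2) \le S(U) + S(V) = 2\, S(V(\boldsymbol{a}_0)). \]

Finally, dividing $(**)$ by $|V(\boldsymbol{a})|$ and going to the limit, similarly as in the proof of the
existence of the Shannon entropy rate in (3.2), using $(*)$ one gets

\[ \limsup_{V(\boldsymbol{a})} \frac{S(V(\boldsymbol{a}))}{|V(\boldsymbol{a})|} \le \frac{S(V(\boldsymbol{a}_0))}{|V(\boldsymbol{a}_0)|} \le s(\omega) + \varepsilon \le \liminf_{V(\boldsymbol{a})} \frac{S(V(\boldsymbol{a}))}{|V(\boldsymbol{a})|} + \varepsilon, \]

whence the result follows from the arbitrariness of $\varepsilon > 0$.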


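Examples 7.5.1

1. For quantum spin-chains $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$, the mean entropy is
given by

\[ s(\omega) = \lim_{n \to \infty} \frac{1}{n}\, S(\rho_{[1,n]}) = \inf_n \frac{1}{n}\, S(\rho_{[1,n]}), \tag{7.129} \]

where $\rho_{[1,n]}$ is the density matrix corresponding to the restriction $\omega \restriction \mathcal{A}_{[1,n]}$ of the
translation invariant state $\omega$ to the local subalgebra $\mathcal{A}_{[1,n]}$.

2. Because of their structure (see Remark 7.4.15), purely generated FCS $\omega$ have
$s(\omega) = 0$. Indeed, the support of local states $\rho_{[1,n]}$ is at most $b^2$-dimensional,
where $M_b(\mathbb{C}) = \mathcal{B}$ is the auxiliary algebra in the triplet $(\mathcal{B}, \rho, \mathbb{E})$; then $S(\rho_{[1,n]}) \le 2 \log_2 b$ for all $n$, so that $S(\rho_{[1,n]})/n \to 0$.

3. Consider the Bosonic (7.22) and Fermionic (7.20) quasi-free states $\omega_A$ and assume
the action of the operator $A$ on $L^2(\mathbb{R}^3)$ to be given by

\[ \langle \boldsymbol{r} | A \psi \rangle = \int_{\mathbb{R}^3} \mathrm{d}\boldsymbol{x}\; K_A(\boldsymbol{r} - \boldsymbol{x})\, \psi(\boldsymbol{x}), \]

where the kernel $K_A$ has Fourier transform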


P 1 f 
Kalk) := oo | dx e ** Kax) 
such that $0 \le \hat{K}_A(\boldsymbol{k}) \le 1$ for Fermions, $0 \le \hat{K}_A(\boldsymbol{k}) \le M < +\infty$ for Bosons.
These quasi-free states are translation-invariant and their mean entropies can be
explicitly calculated [132,276],


S(wa) = z dk (Ka 00) + nl - Ra) (Fermions) 
1 Ja pa 
son = Gos f, ak (Rat) ~ n+ Rak) Bosons). 
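Remarks 7.5.1

1. The entropy density of quantum spin-chains scales as the power of the shift-automorphism,
that is the entropy production per length $\ell$ time-step is $\ell$ times the
entropy production per unit time-step:

\[ s_\ell(\omega) := \lim_{n \to \infty} \frac{1}{n}\, S(\rho^{(n\ell)}) = \ell\, s(\omega), \tag{7.130} \]

where $\rho^{(n\ell)} = \omega \restriction \mathcal{A}_{[0,n\ell-1]}$. Indeed, since the limit in (7.129) exists, it can be computed as

\[ s(\omega) = \lim_{n \to +\infty} \frac{1}{n\ell}\, S(\rho^{(n\ell)}) = \frac{1}{\ell}\, s_\ell(\omega). \]

2. From (5.166), it follows that the entropy density is affine over all convex decompositions
of $\Theta_\sigma$-invariant states $\omega$ of quantum spin-chains into $\Theta_\sigma$-invariant components
$\omega_j$. Namely, if $\omega = \sum_j \lambda_j \omega_j$, with $0 \le \lambda_j \le 1$, $\sum_j \lambda_j = 1$, then

\[ s_\ell(\omega) = \sum_j \lambda_j\, s_\ell(\omega_j), \qquad \forall\, \ell \in \mathbb{N}. \tag{7.131} \]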


3. In the case of a decomposition of a translation-invariant state $\omega$ of a quantum
spin-chain into $\Theta_\sigma^\ell$-invariant components $\omega_j$, the previous two points give


sew) = Y Ajsej) Y Aw), VEEN. (7.132) 
j j 


Let us consider the decomposition of a $\Theta_\sigma$-invariant state $\omega$ over a quantum spin-chain
which is not $\Theta_\sigma^\ell$-ergodic into $n_\ell$ $\Theta_\sigma^\ell$-ergodic states (see Proposition 7.4.9).

Lemma 7.5.1 Given the decomposition $\omega = \frac{1}{n_\ell} \sum_{j=0}^{n_\ell - 1} \omega_j$, using the notation of
Remark 7.5.1, it turns out that

1. all states $\omega_j$ have the same entropy density with respect to $\Theta_\sigma^\ell$: $s_\ell(\omega_j) = s_\ell(\omega)$,
$0 \le j \le n_\ell - 1$.
2. Set $s_j^{(\ell)} := \frac{1}{\ell}\, S(\rho_j^{(\ell)})$, $s^{(\ell)} := \frac{1}{\ell}\, S(\rho^{(\ell)})$ and fix $\eta > 0$; it turns out that the subsets
of states

\[ A_{\ell,\eta} := \big\{ \omega_j : s_j^{(\ell)} \ge s(\omega) + \eta \big\} \tag{7.133} \]

have asymptotically zero density. Namely, if $\#(A_{\ell,\eta})$ denotes the cardinality of $A_{\ell,\eta}$, then

\[ \lim_{\ell \to \infty} \frac{\#(A_{\ell,\eta})}{n_\ell} = 0. \tag{7.134} \]


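Proof Part 1 Because of (7.132), it is enough to show that $s_\ell(\omega_j) = s_\ell(\omega_0)$, for $1 \le j \le n_\ell - 1$. This fact
follows from subadditivity (5.171) and the fact that

\[ \rho_j^{(n\ell)} = \omega_0 \circ \Theta_\sigma^{-j} \restriction \mathcal{A}_{[0,n\ell-1]} = \omega_0 \restriction \mathcal{A}_{[-j,n\ell-j-1]}. \]

Indeed, split the intervals $[-j, n\ell - j - 1]$, $0 \le j \le n_\ell - 1$, into disjoint pieces
(notice that, according to Proposition 7.4.9, $n_\ell \le \ell$),

\[ [-j, n\ell - j - 1] = \underbrace{[-j, \ell - 1]}_{I_1} \cup \underbrace{[\ell, n\ell - \ell - 1]}_{I_2} \cup \underbrace{[n\ell - \ell, n\ell - j - 1]}_{I_3}; \]

then, apply (5.171) to the density matrices $\rho_0^{I_2}$, respectively $\rho_0^{I_1 \cup I_3}$, and use
translation-invariance together with the bound (5.165). Since $I_1 \cup I_3$ consists of $2\ell$ sites, it then follows

\[ S\big(\rho_j^{(n\ell)}\big) \le S\big(\rho_0^{I_2}\big) + S\big(\rho_0^{I_1 \cup I_3}\big) \le S\big(\rho_0^{((n-2)\ell)}\big) + 2\ell \log_2 d. \]

Vice versa, if instead of subdividing the interval $[-j, n\ell - j - 1]$ of interest, we
include it as a disjoint piece in a larger one

\[ [-\ell, n\ell + \ell - 1] = \underbrace{[-\ell, -j - 1]}_{I_1} \cup \underbrace{[-j, n\ell - j - 1]}_{I_2} \cup \underbrace{[n\ell - j, n\ell + \ell - 1]}_{I_3}, \]

then, subadditivity and boundedness give

\[ S\big(\rho_j^{(n\ell)}\big) \ge S\big(\rho_0^{((n+2)\ell)}\big) - S\big(\rho_0^{I_1 \cup I_3}\big) \ge S\big(\rho_0^{((n+2)\ell)}\big) - 2\ell \log_2 d. \]

Dividing by $n$ and taking the limit $n \to \infty$ yield the result.

Part 2 If there were $\eta_0$ such that $\limsup_{\ell \to \infty} \frac{\#(A_{\ell,\eta_0})}{n_\ell} = a > 0$, then there would be
a subsequence $\ell_j$ such that $\lim_{j \to \infty} \frac{\#(A_{\ell_j,\eta_0})}{n_{\ell_j}} = a$. Then, since $\rho^{(\ell_j)} = \frac{1}{n_{\ell_j}} \sum_{k=0}^{n_{\ell_j} - 1} \rho_k^{(\ell_j)}$,
the concavity of the von Neumann entropy implies

\[ n_{\ell_j}\, s^{(\ell_j)} \ge \sum_{k=0}^{n_{\ell_j} - 1} s_k^{(\ell_j)} \ge \#(A_{\ell_j,\eta_0})\, \big(s(\omega) + \eta_0\big) + \big(n_{\ell_j} - \#(A_{\ell_j,\eta_0})\big) \min_{\omega_k \notin A_{\ell_j,\eta_0}} s_k^{(\ell_j)}. \]

The previous point, (7.130) and (7.129) give

\[ \ell_j\, s(\omega) = s_{\ell_j}(\omega_k) = \inf_n \frac{1}{n}\, S\big(\rho_k^{(n\ell_j)}\big) \le S\big(\rho_k^{(\ell_j)}\big) = \ell_j\, s_k^{(\ell_j)}, \]

whence

\[ s^{(\ell_j)} \ge \frac{\#(A_{\ell_j,\eta_0})}{n_{\ell_j}}\, \big(s(\omega) + \eta_0\big) + \Big(1 - \frac{\#(A_{\ell_j,\eta_0})}{n_{\ell_j}}\Big)\, s(\omega). \]

When $\ell_j \to \infty$, a contradiction arises:

\[ s(\omega) \ge \big(s(\omega) + \eta_0\big)\, a + s(\omega)\, (1 - a) > s(\omega). \]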


7.6 Quantum Spin-Chains as Quantum Sources 


Quantum spin-chains $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$, with $\mathcal{A} = M_d(\mathbb{C})$, provide useful algebraic
descriptions of quantum sources whose signals consist of quantum states acting on
Hilbert spaces of increasing dimension. The local states $\rho^{(n)}$ obtained as restrictions
of $\omega$ to the local subalgebras $\mathcal{A}^{(n)} := \mathcal{A}_{[1,n]}$ describe ensembles of quantum strings
of length $n$ emitted by these sources.

Quantum sources are one of the two ends of quantum transmission channels; 
like their classical counterparts, these consist of a source, a sender who encodes, a 
channel which transmits and a receiver which decodes. Channel inputs and outputs 
are generic quantum states and the encoding and decoding procedures, as well as 
the channel action are quantum operations described by trace-preserving CP maps 
on the state-space. 

In analogy with Fig. 2.2, a quantum transmission scheme can be pictorially represented
as in Fig. 7.1.

Fig. 7.1 Quantum transmission channel

1. At each stroke of time, a source $A$ emits quantum states, represented by density
matrices $\rho_i \in \mathbb{B}_1^+(\mathbb{H})$, $i = 1, 2, \ldots, a$, $\mathbb{H} = \mathbb{C}^d$, with weights $p(i)$. The statistical
description of a single use of the source is given by means of the density matrix
$\rho = \sum_{i=1}^{a} p(i)\, \rho_i$.

2. As a result of $n$ uses of the source, the sender would collect generic density
matrices $\rho^{(n)}_{i^{(n)}} \in \mathbb{B}_1^+(\mathbb{H}^{\otimes n})$, $i^{(n)} = i_1 i_2 \cdots i_n \in \Omega_a^{(n)}$, with weights $p^{(n)}(i^{(n)})$. Consequently,
the statistics of $n$ uses of the source is described by the density matrix

\[ \rho^{(n)} = \sum_{i^{(n)} \in \Omega_a^{(n)}} p^{(n)}(i^{(n)})\, \rho^{(n)}_{i^{(n)}}, \tag{7.135} \]

which embodies purely classical correlations (the weights) and quantum correlations
due to the states $\rho^{(n)}_{i^{(n)}}$.

3. The encoding is a trace-preserving CP map $\mathcal{E}^{(n)} : \mathbb{B}_1^+(\mathbb{H}^{\otimes n}) \to \mathbb{B}_1^+(\mathbb{K}_1^{(n)})$ such
that the code-states $\sigma^{(n)}_{i^{(n)}} := \mathcal{E}^{(n)}[\rho^{(n)}_{i^{(n)}}]$ are density matrices that can all be considered as acting on
a same (finite dimensional) Hilbert space $\mathbb{K}_1^{(n)}$.

4. The code-states $\sigma^{(n)}_{i^{(n)}}$ go through the (lossless) channel that transforms them as
a trace-preserving CP map $\mathcal{F}^{(n)} : \mathbb{B}_1^+(\mathbb{K}_1^{(n)}) \to \mathbb{B}_1^+(\mathbb{K}_2^{(n)})$ such that $\mathcal{F}^{(n)}[\sigma^{(n)}_{i^{(n)}}] = \tilde{\sigma}^{(n)}_{i^{(n)}}$,
the latter being a, possibly not-normalized, positive matrix acting on a (finite
dimensional) Hilbert space $\mathbb{K}_2^{(n)}$.

5. The channel output $\tilde{\sigma}^{(n)}_{i^{(n)}}$ finally undergoes a decompressing procedure corresponding
to the action of a CP map $\mathcal{D}^{(n)} : \mathbb{B}_1^+(\mathbb{K}_2^{(n)}) \to \mathbb{B}_1^+(\mathbb{H}^{\otimes n})$ such that
$\mathcal{D}^{(n)}[\tilde{\sigma}^{(n)}_{i^{(n)}}] = \tilde{\rho}^{(n)}_{i^{(n)}}$.

6. The efficiency of the encoding-decoding procedures with respect to the channel
action $\mathcal{F}$ is measured by how faithfully the decompressed states $\tilde{\rho}^{(n)}_{i^{(n)}}$ reproduce
the input states $\rho^{(n)}_{i^{(n)}}$.

The simplest instance of quantum source is the generalization of a classical
Bernoulli process: at each use of the source, vector states $|\psi_i\rangle \in \mathbb{H} := \mathbb{C}^d$,
$i = 1, 2, \ldots, a$ (not necessarily orthogonal), are independently emitted with weights
$p(i)$. The quantum statistics of $n$ uses of the source is thus described by the density
matrix

\[ \rho^{(n)} = \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\, |\psi_{i^{(n)}}\rangle\langle\psi_{i^{(n)}}| = \rho^{\otimes n}, \qquad |\psi_{i^{(n)}}\rangle := \bigotimes_{j=1}^{n} |\psi_{i_j}\rangle, \tag{7.136} \]

where $\rho = \sum_{i=1}^{a} p(i)\, |\psi_i\rangle\langle\psi_i|$ and $p(i^{(n)}) = \prod_{j=1}^{n} p(i_j)$.
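A small numerical illustration may help to see what non-orthogonal signals imply for such a source. The sketch below considers a hypothetical qubit Bernoulli source (not taken from the text) that emits the real vectors $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$ with equal weights, builds $\rho$, and compares its von Neumann entropy with the Shannon entropy of the weights. Only real amplitudes are handled, which is an assumption of this sketch.

```python
import math

def density_matrix(states, weights):
    """rho = sum_i p(i) |psi_i><psi_i| for real qubit vectors psi_i = (a, b)."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for (a, b), p in zip(states, weights):
        rho[0][0] += p * a * a
        rho[0][1] += p * a * b
        rho[1][0] += p * b * a
        rho[1][1] += p * b * b
    return rho

def eigenvalues_2x2(m):
    """Eigenvalues of a real symmetric 2 x 2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.sqrt(((m[0][0] - m[1][1]) / 2) ** 2 + m[0][1] ** 2)
    return mean + radius, mean - radius

def entropy_bits(probabilities):
    """Shannon entropy in bits; applied to eigenvalues it gives S(rho)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical source: |0> and |+> emitted with equal weights.
states = [(1.0, 0.0), (1 / math.sqrt(2), 1 / math.sqrt(2))]
weights = [0.5, 0.5]

rho = density_matrix(states, weights)
von_neumann = entropy_bits(eigenvalues_2x2(rho))
shannon = entropy_bits(weights)
print(von_neumann, shannon)
```

For this source the von Neumann entropy of $\rho$ is strictly smaller than the Shannon entropy of the weights, and since $\rho^{(n)} = \rho^{\otimes n}$ the entropy of $n$ uses is $n$ times the single-use value.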
Remarks 7.6.1 


1. Quantum spin-chains as they appear in quantum statistical mechanics provide
fairly general models of quantum sources. Their local states over $n$ successive
chain sites correspond to density matrices $\rho^{(n)}$ that describe a variety of possible
quantum strings of length $n$ consisting of separable and entangled states that can
in turn be pure and mixed.

2. Like classical strings, quantum strings emitted from quantum sources of Bernoulli 
type can be chained together by tensorizing them; this is not anymore so obvious 
for generic quantum strings [76]. 

3. Two classical strings can always be told apart, for instance by a non-zero value 
of the Hamming distance that counts by how many symbols they differ. Instead, 
there are uncountably many quantum strings that can be arbitrarily close to one 
another, for instance with respect to the trace-distance (6.78), and which cannot 
then be perfectly distinguished. 


7.6.1 Quantum Compression Theorems 


In analogy with classical coding, the idea of how to compress quantum information in
the absence of noise is to consider quantum strings acting on Hilbert spaces $\mathbb{H}^{\otimes n}$ with $n$
large and to map them into quantum strings acting on Hilbert spaces $\mathbb{H}^{(n)}$ of smaller
dimension in a way that allows for faithful decompression.

Concretely, the procedure consists in a coding operation corresponding to a trace-preserving
CP compression map $\mathcal{E}^{(n)} : \mathbb{B}_1^+(\mathbb{H}^{\otimes n}) \to \mathbb{B}_1^+(\mathbb{H}^{(n)})$ and a decoding operation
described by a trace-preserving CP decompression map $\mathcal{D}^{(n)} : \mathbb{B}_1^+(\mathbb{H}^{(n)}) \to \mathbb{B}_1^+(\mathbb{H}^{\otimes n})$
that tries to retrieve the source signals. If the task is to compress the information
contained in $n$ uses of a quantum source, then each of the quantum strings in
(7.136) is subjected to the chain of maps

\[ \rho^{(n)}_{i^{(n)}} \mapsto \sigma^{(n)}_{i^{(n)}} := \mathcal{E}^{(n)}[\rho^{(n)}_{i^{(n)}}] \mapsto \tilde{\rho}^{(n)}_{i^{(n)}} := \mathcal{D}^{(n)}[\sigma^{(n)}_{i^{(n)}}]. \tag{7.137} \]

Any sequence $\{\mathcal{E}^{(n)}, \mathcal{D}^{(n)}\}_{n}$ will be referred to as a compression scheme and denoted
by $(\mathcal{E}, \mathcal{D})$.

In the following, we shall first focus on quantum Bernoulli sources emitting qubits,
that is we shall consider Hilbert spaces $\mathbb{H}^{\otimes n} = \mathbb{C}^{2^n}$ and local algebras $\mathcal{A}^{(n)} = M_{2^n}(\mathbb{C})$.
For them, the compression rate of a scheme $(\mathcal{E}, \mathcal{D})$ is defined as follows.

Definition 7.6.1 (Compression Rate) The compression rate of $(\mathcal{E}, \mathcal{D})$ for a qubit
quantum source $(\mathcal{A}_{\mathbb{Z}}, \omega)$ is given by

\[ R(\mathcal{E}) := \limsup_{n \to +\infty} \frac{1}{n} \log_2 \dim(\mathbb{H}^{(n)}), \]

where $\mathbb{H}^{(n)}$ is the minimal support subspace of all quantum code-words $\sigma^{(n)}_{i^{(n)}}$.

According to the previous definition, for large $n$, $2^{n R(\mathcal{E})}$ estimates the dimension
of the subspace supporting the encoded signals, with $R(\mathcal{E})$ roughly being the number
of qubits used per encoded qubit. Clearly, one looks for compression schemes
$(\mathcal{E}, \mathcal{D})$ such that $R(\mathcal{E}) < 1$ with $\mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}$ asymptotically approximating the identity
map in a suitable topology.


7.6.1.1 Compression of qubit Bernoulli Sources 
In the case of a Bernoulli quantum source, Shannon’s noiseless coding Theorem 3.2.2 
has a natural quantum extension whereby the von Neumann entropy plays the role 
of the Shannon entropy as optimal compression rate. 

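A convenient fidelity is the ensemble fidelity introduced in Definition 6.3.6; using
(6.83) it reads:

\[ F_{\mathrm{av}}\big(\{\rho^{(n)}_{i^{(n)}}\}, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}\big) := \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\; \mathrm{Tr}\big(\rho^{(n)}_{i^{(n)}}\, \tilde{\rho}^{(n)}_{i^{(n)}}\big), \qquad \tilde{\rho}^{(n)}_{i^{(n)}} = \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}[\rho^{(n)}_{i^{(n)}}]. \tag{7.138} \]

It is positive, bounded by 1 and equal to 1 if and only if $\tilde{\rho}^{(n)}_{i^{(n)}} = \rho^{(n)}_{i^{(n)}}$. Useful upper
and lower bounds to $F_{\mathrm{av}}$ are obtained as follows.
If $\rho = \sum_{j=1}^{2} r_j\, |r_j\rangle\langle r_j| \in \mathcal{S}(\mathbb{C}^2)$ is the spectral decomposition of the state
describing a single use of the source, the eigenvalues $r_{j^{(n)}}$ of $\rho^{\otimes n}$ are of the form
$r_{j^{(n)}} = \prod_{i=1}^{n} r_{j_i}$. Let $\mathbb{H}^{(n)} \subseteq \mathbb{H}^{\otimes n}$ be the smallest subspace, of dimension $d(n)$, supporting
all code-words $\sigma^{(n)}_{i^{(n)}}$ and let $\Pi^{(n)} : \mathbb{H}^{\otimes n} \to \mathbb{H}^{(n)}$ denote the corresponding
orthogonal projection. Then,

\[ F_{\mathrm{av}}\big(\{\rho^{(n)}_{i^{(n)}}\}, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}\big) \le \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\; \mathrm{Tr}\big(\rho^{(n)}_{i^{(n)}}\, \Pi^{(n)}\big) = \mathrm{Tr}\big(\rho^{\otimes n}\, \Pi^{(n)}\big) \le \sum_{j=1}^{d(n)} e_j(\rho^{\otimes n}). \tag{7.139} \]

Indeed, the first inequality is implied by the fact that $\tilde{\rho}^{(n)}_{i^{(n)}} \le \Pi^{(n)}$, while the second
one is the Ky Fan inequality (5.168), where $e_j(\rho^{\otimes n})$, $j = 1, 2, \ldots, d(n)$, are the first
$d(n)$ largest eigenvalues of $\rho^{\otimes n}$.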
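Vice versa, given $\Pi^{(n)} : \mathbb{H}^{\otimes n} \to \mathbb{H}^{(n)}$, consider the trace-preserving CP map
$\mathcal{E}^{(n)} : \mathbb{B}_1^+(\mathbb{H}^{\otimes n}) \to \mathbb{B}_1^+(\mathbb{H}^{(n)})$ defined by

\[ \mathcal{E}^{(n)}[\rho] = \Pi^{(n)} \rho\, \Pi^{(n)} + \sum_{|\phi_k\rangle \perp \mathbb{H}^{(n)}} |0\rangle\langle\phi_k|\, \rho\, |\phi_k\rangle\langle 0| = \Pi^{(n)} \rho\, \Pi^{(n)} + |0\rangle\langle 0|\; \mathrm{Tr}\big((\mathbb{1} - \Pi^{(n)})\, \rho\big), \tag{7.140} \]

where $|0\rangle \in \mathbb{H}^{(n)}$ is a suitable reference state and the $|\phi_k\rangle$ form an ONB of the orthogonal complement of $\mathbb{H}^{(n)}$. As a decompression map $\mathcal{D}^{(n)}$, choose
the identity map on $\mathbb{H}^{(n)}$ which embeds it into $\mathbb{H}^{\otimes n}$. Then,

\[ \tilde{\rho}^{(n)}_{i^{(n)}} = \Pi^{(n)} \rho^{(n)}_{i^{(n)}}\, \Pi^{(n)} + |0\rangle\langle 0|\; \mathrm{Tr}\big((\mathbb{1} - \Pi^{(n)})\, \rho^{(n)}_{i^{(n)}}\big), \]

whence, since $\rho^{(n)}_{i^{(n)}}$ is a pure state,

\[ F_{\mathrm{av}}\big(\{\rho^{(n)}_{i^{(n)}}\}, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}\big) \ge \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\; \mathrm{Tr}\big(\rho^{(n)}_{i^{(n)}}\, \Pi^{(n)} \rho^{(n)}_{i^{(n)}}\, \Pi^{(n)}\big) = \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\, \Big(\mathrm{Tr}\big(\rho^{(n)}_{i^{(n)}}\, \Pi^{(n)}\big)\Big)^2 \]
\[ \ge \sum_{i^{(n)} \in \Omega_a^{(n)}} p(i^{(n)})\, \Big(2\, \mathrm{Tr}\big(\rho^{(n)}_{i^{(n)}}\, \Pi^{(n)}\big) - 1\Big) = 2\, \mathrm{Tr}\big(\rho^{\otimes n}\, \Pi^{(n)}\big) - 1. \tag{7.141} \]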

Exactly as the Shannon entropy in the classical case, a theorem of Schumacher 
[191,314] shows that, for quantum sources of Bernoulli type, the von Neumann 
entropy S (p) is the optimal compression rate. Namely, this rate can be achieved 
by suitable compression and decompression schemes with high-fidelity retrieval of 
increasingly long qubit strings; on the other hand, compression and decompression 
schemes with rates exceeding S (p) perform poorly with long qubit strings. 


Theorem 7.6.1 (Schumacher Theorem) Let $(\mathcal{A}_{\mathbb{Z}}, \rho^{\otimes\infty})$ be a qubit Bernoulli source with entropy density $S(\rho)$. If $R > S(\rho)$ there exists a compression scheme $(\mathcal{E}, \mathcal{D})$ with rate $R(\mathcal{E}) = R$ and ensemble fidelity $F_{av}$ tending to 1. On the contrary, if $R < S(\rho)$, then for every compression scheme such that $R(\mathcal{E}) = R$ the ensemble fidelity tends to 0 in the limit $n \to \infty$.


Proof The eigenvalues $r^{(n)}_{j^{(n)}}$ of $\rho^{\otimes n}$ provide a probability distribution $\pi^{(n)} = \{r^{(n)}_{j^{(n)}}\}_{j^{(n)}\in\Omega_2^{(n)}}$ on the strings $j^{(n)} \in \Omega_2^{(n)}$ with Shannon entropy $n\,S(\rho)$. According to Proposition 3.2.2, for any $\delta > 0$, $\varepsilon > 0$ and $n$ large enough, there exists a subset $A^{(n)}$ of probability

$$\mathrm{Prob}(A^{(n)}) = \sum_{j^{(n)}\in A^{(n)}} r^{(n)}_{j^{(n)}} = \mathrm{Tr}\big(\rho^{\otimes n}\,\Pi^{(n)}\big) \ge 1-\delta$$

and cardinality $d(n)$ satisfying

$$(1-\delta)\,2^{n(S(\rho)-\varepsilon)} \le d(n) \le 2^{n(S(\rho)+\varepsilon)} .$$

Let $\Pi^{(n)}$ project onto the subspace linearly spanned by the eigenvectors $|r_{j^{(n)}}\rangle = |r_{j_1}\rangle\otimes|r_{j_2}\rangle\otimes\cdots\otimes|r_{j_n}\rangle$ corresponding to the eigenvalues $r^{(n)}_{j^{(n)}}$, $j^{(n)} \in A^{(n)}$. Such $\Pi^{(n)}$ can be used to construct the compression map (7.140): hence, from (7.141), $F_{av} \ge 1 - 2\delta$. Also, the bounds on $d(n)$ ensure that any rate $R > S(\rho)$ is achievable.

Vice versa, if $d(n) \le 2^{n(S(\rho)-\varepsilon)}$, then, according to Theorem 3.2.2, given the probability distribution $\pi^{(n)} = \{r^{(n)}_{j^{(n)}}\}_{j^{(n)}\in\Omega_2^{(n)}}$, any subset $B_{d(n)}$ with $d(n)$ strings has vanishingly small probability,

$$\mathrm{Prob}(B_{d(n)}) = \sum_{j^{(n)}\in B_{d(n)}} r^{(n)}_{j^{(n)}} = \mathrm{Tr}\big(\rho^{\otimes n}\,\Pi_{d(n)}\big) \le \varepsilon$$

for $n$ large enough, where $\Pi_{d(n)}$ projects onto the subspace spanned by the eigenvectors relative to the eigenvalues indexed by $j^{(n)} \in B_{d(n)}$. It then follows that also the sum of the $d(n)$ largest eigenvalues of $\rho^{\otimes n}$ must be smaller than $\varepsilon$ and so also $F_{av} \le \varepsilon$ because of (7.139).
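The counting argument used in the proof can be tried out numerically. The following is only a sketch under stated assumptions: the single-qubit state is taken to have eigenvalues $(1-r, r)$, a string is called typical when minus the logarithm of its probability per symbol lies within `eps` of the entropy, and the names `binary_entropy` and `typical_set` are illustrative, not taken from the text.

```python
from math import comb, log2


def binary_entropy(r: float) -> float:
    """Shannon entropy, in bits, of the distribution (r, 1 - r)."""
    if r <= 0.0 or r >= 1.0:
        return 0.0
    return -r * log2(r) - (1 - r) * log2(1 - r)


def typical_set(r: float, n: int, eps: float) -> tuple[int, float]:
    """Return (cardinality, probability) of the eps-typical strings of length n.

    Requires 0 < r < 1. A string with k ones has probability
    r**k * (1 - r)**(n - k) and there are comb(n, k) such strings; the
    cardinality plays the role of d(n), the probability that of the
    weight Tr(rho^{(x)n} Pi^{(n)}) of the typical projector.
    """
    entropy = binary_entropy(r)
    size, prob = 0, 0.0
    for k in range(n + 1):
        log_p = k * log2(r) + (n - k) * log2(1 - r)  # log2 of one string's probability
        if abs(-log_p / n - entropy) <= eps:
            size += comb(n, k)
            prob += comb(n, k) * 2.0 ** log_p
    return size, prob


if __name__ == "__main__":
    r, eps = 0.1, 0.05
    for n in (50, 200, 800):
        size, prob = typical_set(r, n, eps)
        # rate log2(d(n))/n, weight of the typical set, entropy S(rho)
        print(n, log2(size) / n, prob, binary_entropy(r))
```

With these conventions the rate $\frac{1}{n}\log_2 d(n)$ never exceeds $S(\rho)+\varepsilon$, because every typical string carries probability at least $2^{-n(S(\rho)+\varepsilon)}$, while the weight of the typical set increases towards 1 as $n$ grows.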


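Example 7.6.1 ([191]) In a single use, a Bernoulli qubit source emits the non-orthogonal states

$$|\psi_0\rangle := \sqrt{1-\varepsilon}\,|0\rangle + \sqrt{\varepsilon}\,|1\rangle , \qquad |\psi_1\rangle := \sqrt{1-\varepsilon}\,|0\rangle - \sqrt{\varepsilon}\,|1\rangle ,$$

where $0 < \varepsilon < 1/2$, with probability 1/2 each; the corresponding statistical ensemble is described by

$$\rho = \frac{1}{2}\,|\psi_0\rangle\langle\psi_0| + \frac{1}{2}\,|\psi_1\rangle\langle\psi_1| = (1-\varepsilon)\,|0\rangle\langle 0| + \varepsilon\,|1\rangle\langle 1| .$$

Suppose that, given the 3-qubit strings $|\Psi_{i_1i_2i_3}\rangle = |\psi_{i_1}\rangle\otimes|\psi_{i_2}\rangle\otimes|\psi_{i_3}\rangle$, only two qubits can be transmitted; how can the transmission of quantum information be optimized?

Since $\langle 0|\psi_0\rangle = \langle 0|\psi_1\rangle = \sqrt{1-\varepsilon}$, $\langle 1|\psi_0\rangle = -\langle 1|\psi_1\rangle = \sqrt{\varepsilon}$ and $\varepsilon < 1/2$, the sender may encode and decode each 3-qubit string by tracing over the third qubit and appending the high probability state $|0\rangle\langle 0|$ in its place:

$$|\Psi_{i_1i_2i_3}\rangle\langle\Psi_{i_1i_2i_3}| \mapsto \tilde\rho^{(1)}_{i_1i_2i_3} := \mathcal{D}^{(3)}_1\circ\mathcal{E}^{(3)}_1\big[\,|\Psi_{i_1i_2i_3}\rangle\langle\Psi_{i_1i_2i_3}|\,\big] = |\Psi_{i_1i_2}\rangle\langle\Psi_{i_1i_2}|\otimes|0\rangle\langle 0| .$$

The average fidelity then results

$$F^{(1)} = \frac{1}{8}\sum_{i_1,i_2,i_3}\langle\Psi_{i_1i_2i_3}|\,\tilde\rho^{(1)}_{i_1i_2i_3}\,|\Psi_{i_1i_2i_3}\rangle = 1-\varepsilon .$$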


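A better strategy arises from considering the components of a 3-qubit string along the eigenvectors of $\rho^{\otimes 3}$:

$$\begin{array}{ll}
|\langle 000|\Psi_{i_1i_2i_3}\rangle| = (1-\varepsilon)^{3/2} & |\langle 110|\Psi_{i_1i_2i_3}\rangle| = \varepsilon\sqrt{1-\varepsilon} \\
|\langle 001|\Psi_{i_1i_2i_3}\rangle| = (1-\varepsilon)\sqrt{\varepsilon} & |\langle 101|\Psi_{i_1i_2i_3}\rangle| = \varepsilon\sqrt{1-\varepsilon} \\
|\langle 010|\Psi_{i_1i_2i_3}\rangle| = (1-\varepsilon)\sqrt{\varepsilon} & |\langle 011|\Psi_{i_1i_2i_3}\rangle| = \varepsilon\sqrt{1-\varepsilon} \\
|\langle 100|\Psi_{i_1i_2i_3}\rangle| = (1-\varepsilon)\sqrt{\varepsilon} & |\langle 111|\Psi_{i_1i_2i_3}\rangle| = \varepsilon^{3/2}
\end{array}$$

Since $\varepsilon < 1/2$, the eigenvectors $|000\rangle$, $|001\rangle$, $|010\rangle$ and $|100\rangle$ provide higher probabilities than the second four eigenvectors; let $P$ project onto their linear span.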
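Observe that the unitary permutation

$$U : \quad \begin{array}{ll}
|000\rangle \mapsto |000\rangle & |111\rangle \mapsto |001\rangle \\
|001\rangle \mapsto |010\rangle & |110\rangle \mapsto |011\rangle \\
|010\rangle \mapsto |100\rangle & |101\rangle \mapsto |101\rangle \\
|100\rangle \mapsto |110\rangle & |011\rangle \mapsto |111\rangle
\end{array}$$

is such that $U P\,U^\dagger = \mathbb{1}_{12}\otimes|0\rangle\langle 0|$, where $\mathbb{1}_{12} = \sum_{i,j}|ij\rangle\langle ij|$ is the identity matrix of the first two qubits. Therefore, one can construct a compression map as follows; first, introduce the trace-preserving CP maps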


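$$\mathbb{E}[\rho] = P\,\rho\,P + |000\rangle\langle 000|\;\mathrm{Tr}\big((\mathbb{1}-P)\,\rho\big)$$

and

$$U\,\mathbb{E}[\rho]\,U^\dagger = \mathbb{1}_{12}\otimes|0\rangle\langle 0|\;\big(U\rho\,U^\dagger\big)\;\mathbb{1}_{12}\otimes|0\rangle\langle 0| + |000\rangle\langle 000|\;\mathrm{Tr}\big((\mathbb{1}-P)\,\rho\big) .$$

Then, define $\mathcal{E}^{(3)}_2 : B_1^+(\mathbb{C}^8) \to B_1^+(\mathbb{C}^4)$ as $\mathcal{E}^{(3)}_2[\rho] = \mathrm{Tr}_3\big(U\,\mathbb{E}[\rho]\,U^\dagger\big)$ and $\mathcal{D}^{(3)}_2 : B_1^+(\mathbb{C}^4) \to B_1^+(\mathbb{C}^8)$ as $\mathcal{D}^{(3)}_2[\sigma] = U^\dagger\,\sigma\otimes|0\rangle\langle 0|\,U$. It follows that

$$\mathcal{D}^{(3)}_2\circ\mathcal{E}^{(3)}_2[\rho] = P\,\rho\,P + |000\rangle\langle 000|\;\mathrm{Tr}\big((\mathbb{1}-P)\,\rho\big) .$$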


Thus, with p?) = DÊ o EPL wisinis X Vani Il 


Piiizi3 


(Pirigis | PE) Wiii) = KViiis | P Wiii YP 
+ | (000 | Piisi 1? (izing | Al — P) Wiii ) 
=1—9e + 15e4 — 969 + 2°. 
Fig. 7.2 $F^{(2)} - F^{(1)}$ against $\varepsilon$, for $0 < \varepsilon < 1$

As the right-hand side of the last equality is the same for all 3-qubit strings considered, this is also the value of the fidelity $F^{(2)}$. As shown in Fig. 7.2, the latter turns out to be larger than $F^{(1)}$ for $0 < \varepsilon < 1/2$.
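Some of the ingredients of Example 7.6.1 can be checked by direct computation. The sketch below is illustrative only: basis vectors are handled through their labels, amplitudes are real, and the helper names (`psi`, `product_state`, `weight_in_P`, `fidelity_discard_third`) are assumptions introduced here. It verifies that the permutation $U$ maps the range of $P$ onto the vectors whose third qubit is $|0\rangle$, that discarding the third qubit yields fidelity $1-\varepsilon$, and that $\langle\Psi|P|\Psi\rangle = (1-\varepsilon)^3 + 3\varepsilon(1-\varepsilon)^2$ for every string.

```python
from itertools import product
from math import isclose, sqrt

BASIS = ["".join(bits) for bits in product("01", repeat=3)]

# The permutation U of the example, written as a map between basis labels.
U = {"000": "000", "001": "010", "010": "100", "100": "110",
     "111": "001", "110": "011", "101": "101", "011": "111"}

# Labels of the four eigenvectors of rho^{(x)3} spanning the range of P.
P_SPAN = {"000", "001", "010", "100"}


def psi(i: int, eps: float) -> dict[str, float]:
    """Amplitudes <0|psi_i> and <1|psi_i> of the single-qubit signal states."""
    sign = 1.0 if i == 0 else -1.0
    return {"0": sqrt(1 - eps), "1": sign * sqrt(eps)}


def product_state(factors: list[dict[str, float]]) -> dict[str, float]:
    """Amplitudes of a three-qubit product vector in the computational basis."""
    return {b: factors[0][b[0]] * factors[1][b[1]] * factors[2][b[2]] for b in BASIS}


def weight_in_P(string: tuple[int, int, int], eps: float) -> float:
    """<Psi|P|Psi> for |Psi> = |psi_i1> x |psi_i2> x |psi_i3>."""
    amp = product_state([psi(i, eps) for i in string])
    return sum(amp[b] ** 2 for b in P_SPAN)


def fidelity_discard_third(string: tuple[int, int, int], eps: float) -> float:
    """Fidelity obtained when the third qubit is replaced by |0><0|."""
    amp = product_state([psi(i, eps) for i in string])
    target = product_state([psi(string[0], eps), psi(string[1], eps),
                            {"0": 1.0, "1": 0.0}])
    overlap = sum(amp[b] * target[b] for b in BASIS)
    return overlap ** 2


if __name__ == "__main__":
    eps = 0.1
    print({U[b] for b in P_SPAN})                      # labels ending in "0"
    print(fidelity_discard_third((0, 1, 1), eps))      # close to 1 - eps
    print(weight_in_P((0, 1, 1), eps))                 # close to (1-eps)^2 (1+2 eps)
```

Both quantities are independent of the chosen string, since the two signal states differ only by the sign of the $|1\rangle$ amplitude.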


7.6.1.2 Compression of Ergodic Quantum Sources 

As shown in Sect. 3.2.1, Proposition 3.2.2, Theorems 3.2.1 and 3.2.2 establish the role of the entropy rate as the optimal compression rate of classical ergodic sources. Theorem 7.6.1 assigns the same role to the von Neumann entropy in the case of quantum sources of Bernoulli type.

For this particular family of quantum chains, the von Neumann entropy coincides with their entropy density as defined in Sect. 7.5; it is thus expected that a kind of general Quantum Shannon-McMillan-Breiman Theorem should hold for generic ergodic quantum sources. The relevance for quantum information of quantum spin-chains endowed with states with a more general structure than a tensor product has been emphasized in Sect. 7.6, where analogies and differences between classical and quantum contexts have also been outlined.

In particular, in Remark 7.6.1.2 it was pointed out that, unlike for classical bit strings, one cannot profit from any natural "chaining together" of qubit-strings. Though ideas on how to circumvent such a problem have been put forward [76], this fact represents an obstruction to a full quantum generalization of the classical Breiman theorem. The latter is an almost everywhere statement regarding single sequences, while the Shannon-McMillan formulation is concerned with statistical ensembles; of this theorem there exist a number of extensions to particular non-commutative settings [115,203,265,283] and a full quantum extension [74]. This general result has then been used [75] to devise compression protocols for ergodic sources consisting of encoding and decoding procedures similar to those outlined in the previous section.


Theorem 7.6.2 Let $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$, with $\mathcal{A} = M_d(\mathbb{C})$ as site-algebras, be an ergodic quantum spin-chain with mean entropy $s(\omega)$. Then, for all $\delta > 0$ there is $N_\delta \in \mathbb{N}$ such that for all $n \ge N_\delta$ there exists an orthogonal projection $p_n(\delta) \in \mathcal{A}_n$ such that

1. $\omega(p_n(\delta)) = \mathrm{Tr}_n\big(\rho^{(n)}\,p_n(\delta)\big) \ge 1-\delta$,
2. for all minimal projections $0 \ne p \in \mathcal{A}_n$ dominated by $p_n(\delta)$ ($p \le p_n(\delta)$), $2^{-n(s(\omega)+\delta)} < \omega(p) < 2^{-n(s(\omega)-\delta)}$,
3. $2^{n(s(\omega)-\delta)} < \mathrm{Tr}_n(p_n(\delta)) < 2^{n(s(\omega)+\delta)}$.


That the above results extend Proposition 3.2.2 is apparent: classical high probability subsets are replaced by orthogonal projections $p_n(\delta)$ whose statistical weight with respect to the translation-invariant state $\omega$ is nearly 1; further, the typical subsets correspond to orthogonal projections whose normalization (dimension of the associated Hilbert subspaces), $\mathrm{Tr}(p_n(\delta))$, goes as $2^{n\,s(\omega)}$ for large $n$.

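Like in the case of a Bernoulli quantum source, the proof of Theorem 7.6.2 hinges upon considering the discrete probability distributions $\pi^{(\ell)} = \{r^{(\ell)}_i\}_{i=1}^{d^\ell}$ consisting of the eigenvalues of the density matrices $\rho^{(\ell)}$ describing the restrictions of $\omega$ to the local subalgebra $\mathcal{A}_\ell = M_d(\mathbb{C})^{\otimes\ell}$ with spectral decomposition $\rho^{(\ell)} = \sum_{i=1}^{d^\ell} r^{(\ell)}_i\,|r^{(\ell)}_i\rangle\langle r^{(\ell)}_i|$. The Shannon entropy $H(\pi^{(\ell)})$ equals the von Neumann entropy $S(\rho^{(\ell)})$; further, from the definition of entropy density $s(\omega)$, given $\eta > 0$, for infinitely many $\ell$ one has

$$s(\omega) = \inf_\ell \frac{1}{\ell}\,S(\rho^{(\ell)}) \le \frac{1}{\ell}\,S(\rho^{(\ell)}) = \frac{1}{\ell}\,H(\pi^{(\ell)}) \le s(\omega) + \eta . \qquad (7.142)$$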


For Bernoulli quantum sources, the products of eigenvalues of single site density matrices provide a natural Bernoulli stochastic process, whose entropy density $s(\omega)$ is exactly the von Neumann entropy of $\rho$. Such a structure is missing in the case of a generic ergodic quantum source. However, from (7.142), one observes that, choosing $\ell$ large enough, $S(\rho^{(\ell)}) \simeq \ell\,s(\omega)$. Moreover, the eigenprojections $|r^{(\ell)}_i\rangle\langle r^{(\ell)}_i|$ are minimal projections generating a maximally Abelian subalgebra $\mathcal{D}_\ell \subseteq \mathcal{A}_\ell$ and the eigenvalues $r^{(\ell)}_i$ define a probability $\pi^{(\ell)}$ over the symbols $i \in I_\ell := \{1, 2, \ldots, d^\ell\}$. By tensorizing copies of the Abelian subalgebra $\mathcal{D}_\ell$, one can embed the Abelian subalgebras $\mathcal{D}_\ell^{(n)} := \mathcal{D}_\ell^{\otimes n}$ into the local algebras $\mathcal{A}_{n\ell}$ and $C^*$-induction yields a quasi-local Abelian algebra $\mathcal{D}_\ell^\infty$ embedded into the quantum spin-chain $\mathcal{A}_{\mathbb{Z}}$.

The Abelian algebra D° is clearly associated to a triplet, or symbolic model 


(2 I Tos p) (see Definition 2.2.5 and the preceding discussion) where Qi is the 


space of sequences of symbols from 7, T, is the shift along these sequences and 
p i is the measure on 2 z that arises from 7. Further, from (7.130), the (classical) 
entropy rate is 


. 1 n 
Kis Jim —s (0! 2) = (w) = £s(w) . (7.143) 


As the automorphism over $\mathcal{D}_\ell^\infty$ corresponding to the shift $T_\sigma$ on the symbolic sequences is not $\Theta_\sigma$, but its $\ell$-th power $\Theta_\sigma^\ell$, the Abelian spin-chain associated to the symbolic model above is $(\mathcal{D}_\ell^\infty, \Theta_\sigma^\ell, \omega\restriction\mathcal{D}_\ell^\infty)$ (see Definition 2.2.5 and the preceding discussion). The state $\omega$ is $\Theta_\sigma$-ergodic, but not in general $\Theta_\sigma^\ell$-ergodic. If it were $\Theta_\sigma^\ell$-ergodic, $(\mathcal{D}_\ell^\infty, \Theta_\sigma^\ell, \omega\restriction\mathcal{D}_\ell^\infty)$ would amount to an ergodic process and we could use the classical techniques as in Proposition 3.2.2 with the mean entropy $h(\omega\restriction\mathcal{D}_\ell^\infty) = \ell\,s(\omega)$ in the place of the Shannon entropy, as follows from the classical Shannon-McMillan-Breiman result in Theorem 3.2.1.


Remarks 7.6.2

1. The ergodicity of the embedded Abelian spin-chain $(\mathcal{D}_\ell^\infty, \Theta_\sigma^\ell, \omega\restriction\mathcal{D}_\ell^\infty)$ would follow from that of the quantum spin-chain $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma^\ell, \omega)$, since otherwise the resulting decomposition of $\omega\restriction\mathcal{D}_\ell^\infty$ into ergodic components would provide a decomposition of $\omega$ as well.

2. In [283], the quantum Shannon-McMillan theorem was proved under the assumption of $\Theta_\sigma^\ell$-ergodicity of the spin-chain state $\omega$: such a property is known as complete ergodicity. This restriction has been removed in [74].


The possible lack of © ergodicity can be overcome by means of Proposition 7.4.9 
and of Lemma 7.5.1. Indeed, the argument of above can be developed for the o£- 


ergodic components wj indexed by j € AG , for which s¢(w;) = £s (w) and = 
is (P) < s(w) +1, for some fixed 7 > 0. For each of these wj, one considers 


j 
the local states p over Ag, the probability distributions ne corresponding to their 


spectra, the Abelian subalgebras D/ generated by their spectral projections p 
i € Ij, and the associated ergodic Abelian spin-chains (oF ef, Wj DP). Because 


of the bound (3.7) in Remark 3.1.1.1 and of the choice of indices j € AS . these 
chains have entropy rates 


hy < H@) =S (P) <L(s(w) +7). (7.144) 


After identifying strings i” of symbols from 7 j with minimal projections p;o) € 
D}, C Ane, so that ie = Trjo,ne—1) (Pp? Piw), one can choose positive £, ô and 
select subsets of minimal projections, 


Ce [piw ED): 210+ < Trjo,ne-11(P”® pj) < geria] (1.145) 
such that, by using Proposition 3.2.2, Theorem 3.2.1 and (7.144), for n large enough 


#CM) = Troe n@w) <2 a EAD (7.146) 


E 
mene + Trjone—11Pj) 21-7, (1.147) 
Pion €C” 
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where p” := D (n) Piw. 
J Pin) eC j i 
In order to use these arguments and conclude the proof of the quantum Shannon- 
Mc Millan theorem, some further results are needed. The first one deals with discrete 
subsets equipped with (not necessarily compatible) probability distributions and with 
the asymptotic behavior of their minimal cardinality. 


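Lemma 7.6.1 Let $D > 0$ and $\{(I_n, \pi_n)\}_{n\in\mathbb{N}}$ be a countable family of finite sets $I_n$ of cardinality $\#(I_n)$ with associated probability distributions $\pi_n = \{p_n(i)\}_{i\in I_n}$. Suppose $\frac{1}{n}\log_2\#(I_n) \le D$ for all $n$ and define

$$\alpha_{\varepsilon,n}(\pi_n) := \min\big\{\log_2\#(\Omega) \,:\, \Omega \subseteq I_n ,\ \pi_n(\Omega) \ge 1-\varepsilon\big\} . \qquad (7.148)$$

If $\{(I_n, \pi_n)\}_{n\in\mathbb{N}}$ satisfies

$$\lim_{n\to\infty}\frac{1}{n}H(\pi_n) = h < \infty \quad (1) \qquad\text{and}\qquad \limsup_{n\to\infty}\frac{1}{n}\alpha_{\varepsilon,n}(\pi_n) \le h \quad (2),$$

for all $\varepsilon \in (0,1)$, then

$$\lim_{n\to\infty}\frac{1}{n}\alpha_{\varepsilon,n}(\pi_n) = h , \qquad \forall\,\varepsilon \in (0,1) . \qquad (7.149)$$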
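For a concrete finite distribution, the quantity in (7.148) is obtained by a greedy argument: the smallest subset of weight at least $1-\varepsilon$ consists of the most probable elements. The following minimal sketch assumes that reading of (7.148); the function name `alpha` is illustrative.

```python
from math import log2


def alpha(probs: list[float], eps: float) -> float:
    """log2 of the smallest cardinality of a subset with probability >= 1 - eps.

    Sorting in decreasing order and accumulating is optimal: any subset of a
    given cardinality has at most the weight of the most probable elements.
    """
    total = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= 1.0 - eps:
            return log2(count)
    return log2(len(probs))  # reached only because of rounding errors


if __name__ == "__main__":
    print(alpha([0.7, 0.2, 0.1], 0.25))   # two elements are needed
    print(alpha([0.25] * 4, 0.5))         # two of four equiprobable elements
```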


Proof Let 6 > 0 be arbitrarily chosen and distinguish the following disjoint subsets 
of In: 


fie ty: mCi) > 2-0-9 
Gos |; Ely: OH < m@ < r : 


i€In : m@ < gaen] l 


Suppose | > lim sup, _,.5 Tn a; (6)) = b > 0; then, there exists n such that m, Ge (ô) 
> b and Ta (1} (5) U 17 (5)) < 1 — b. Choose 0 < e < b; if mn (2) > 1 — e, 


1=eE<m(2)<1-b+ m(2 n RO) implies 
b-ez m (2 n rO) 2 #(2 n RO) 2-nh+3 and 
log, #(2 n RO) > log(b— €) + n(h +6), 


1 
whence lim —a@z.n(tm) > h + ô contradicting the second condition in the state- 
n>n 


ment of the lemma. Thus, lim m, OA (6) = 0. 
n> o0 
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It also follows that 7 a (ô) cannot asymptotically contribute to the Shannon entropy 
H (rn); indeed, applying inequality (2.85) to the (non-normalized) distributions 
Tn In 6) 


{Pr Ohera and J Gn (i) := FEO , which are such that }7<73(5) Pn) = 
n 


|, €13 (9) 
vier) qn(i), yields 


1 1 
in = —— J, pnli)logy pn) S—— D7) Pali) logy an(i) 


icl (ô) ie13(6) 
l l 

=-- SP O(log mO) = log #0) 
icl (ô) 


IA 


1 
—m, (I? (5) logy Tn (IO) + D Ta (13(6)) . 


The right hand side of the last inequality goes to 0 with n —> oo due to m, (J 2 (6) 0 
with n — oo and because log, #(In) < nD by assumption. Further, this very same 
fact implies limy- +o Tn ee (6) = 0 for all 6 > 0, otherwise 


1 1 : no 1l ; ; 
-H (tm) =—— $. PmOlog pO- — D> pn) logy Pali) — ha 
n n n 
ieI} (6) ie12(6) 
< Tn (I (5)) (h — 5) + Ta CRC) + ha 
<h + (Ta O) — TZO) + hy 
would contradict the first condition of the lemma for sufficiently small ô. 
Consequently, lim 7, (7 (ô)) = 1 for sufficiently small 6. Thus, choosing n so 
n—>o0 
that mn ($2) > 1 — € and m,(12(5)) > 1 — n, it follows that m(2 n 20) sis 
€ — n, whence 


#(2 n rO) >(1—e—7)2""-® implies 


1 1 
z Cen (Tn) 2z z 220 —e-n)+h-— ô. 


Since 6 can be chosen arbitrarily small, the result follows. 


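Returning to the probability distribution $\pi^{(\ell)}$ associated with the ordered spectrum of $\rho^{(\ell)}$, fix $\varepsilon \in (0,1)$ and set

$$N_{\varepsilon,\ell} := \min\Big\{1 \le k \le d^\ell \,:\, \sum_{i=1}^{k} r^{(\ell)}_i \ge 1-\varepsilon\Big\} , \qquad (7.150)$$

so that $\alpha_{\varepsilon,\ell}(\pi^{(\ell)}) = \log_2 N_{\varepsilon,\ell}$. If

$$\limsup_{\ell\to\infty}\frac{1}{\ell}\log_2 N_{\varepsilon,\ell} \le s(\omega) , \qquad (7.151)$$

then, together with (7.142), this allows using Lemma 7.6.1, as follows: let $I_\ell$ be the set of indices labeling the eigenprojections $|r^{(\ell)}_i\rangle\langle r^{(\ell)}_i|$. Then, choose $0 < \delta' < \delta$, consider the subset $I^2_\ell(\delta')$ as constructed in the lemma and set

$$P_\ell(\delta) := \sum_{i\in I^2_\ell(\delta')} |r^{(\ell)}_i\rangle\langle r^{(\ell)}_i| .$$

It turns out that, for $\ell$ sufficiently large,

$$\mathrm{Tr}\big(\rho^{(\ell)}\,P_\ell(\delta)\big) = \sum_{i\in I^2_\ell(\delta')} r^{(\ell)}_i = \pi^{(\ell)}\big(I^2_\ell(\delta')\big) \ge 1-\delta .$$

Further, every minimal projection $p \le P_\ell(\delta)$ dominated by $P_\ell(\delta)$ projects onto a vector of $(\mathbb{C}^d)^{\otimes\ell}$,

$$p = |\psi\rangle\langle\psi| , \qquad |\psi\rangle = \sum_{i\in I^2_\ell(\delta')} a_i\,|r^{(\ell)}_i\rangle , \qquad \sum_{i\in I^2_\ell(\delta')} |a_i|^2 = 1 ,$$

whence, by the definition of the subset $I^2_\ell(\delta')$,

$$2^{-\ell(s(\omega)+\delta)} < 2^{-\ell(s(\omega)+\delta')} \le \mathrm{Tr}\big(\rho^{(\ell)}\,p\big) \le 2^{-\ell(s(\omega)-\delta')} < 2^{-\ell(s(\omega)-\delta)} .$$

Finally, from $\mathrm{Tr}(\rho^{(\ell)}\,P_\ell(\delta)) \ge 1-\delta$ and

$$\#\big(I^2_\ell(\delta')\big)\,2^{-\ell(s(\omega)+\delta)} \le \mathrm{Tr}\big(\rho^{(\ell)}\,P_\ell(\delta)\big) = \sum_{i\in I^2_\ell(\delta')} r^{(\ell)}_i \le \#\big(I^2_\ell(\delta')\big)\,2^{-\ell(s(\omega)-\delta)}$$

it follows that $(1-\delta)\,2^{\ell(s(\omega)-\delta)} \le \mathrm{Tr}(P_\ell(\delta)) = \#(I^2_\ell(\delta')) \le 2^{\ell(s(\omega)+\delta)}$, thus concluding the proof of Theorem 7.6.2.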

Of course, it remains to be shown that (7.151) really holds true. The proof of this fact hinges upon a per se interesting result concerning the minimal dimension of the so-called high probability subspaces. Practically speaking, these are the relevant subspaces: as already seen in the case of Bernoulli quantum sources and as will be shown at the end of this section, they allow for quantum compression with reliable retrieval.

Definition 7.6.2 (Typical Subspaces)

1. Given a quantum spin-chain $(\mathcal{A}_{\mathbb{Z}}, \omega)$, projectors $p_n \in \mathcal{A}_n$ such that $\omega(p_n) = \mathrm{Tr}(\rho^{(n)}\,p_n) \ge 1-\varepsilon$ will be termed $\varepsilon$-typical projectors and $\varepsilon$-typical subspaces the subspaces of $(\mathbb{C}^d)^{\otimes n}$ onto which they project.

2. For any $\varepsilon > 0$, let

$$\beta_{\varepsilon,n}(\omega) := \min\big\{\log_2\mathrm{Tr}_n(q) \,:\, \mathcal{A}_n \ni q = q^\dagger = q^2 ,\ \mathrm{Tr}(\rho^{(n)}\,q) \ge 1-\varepsilon\big\} . \qquad (7.152)$$

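The following result relates the spectrum of local states to the dimension of high probability subspaces [74,171,172].


Lemma 7.6.2 $\beta_{\varepsilon,n}(\omega)$ equals $\log_2 N_{\varepsilon,n}$, with $N_{\varepsilon,n}$ as in (7.150).


Proof By definition,

$$\sum_{i=1}^{N_{\varepsilon,n}} \mathrm{Tr}\big(\rho^{(n)}\,|r^{(n)}_i\rangle\langle r^{(n)}_i|\big) = \sum_{i=1}^{N_{\varepsilon,n}} r^{(n)}_i \ge 1-\varepsilon ,$$

whence $\beta_{\varepsilon,n}(\omega) \le \log_2 N_{\varepsilon,n}$. If the inequality is strict, then there exists a projection $q \in \mathcal{A}_n$ such that $m := \mathrm{Tr}(q) < N_{\varepsilon,n}$ and $\mathrm{Tr}(\rho^{(n)}\,q) \ge 1-\varepsilon$. Then, using the Ky Fan inequality (5.168), a contradiction emerges:

$$1-\varepsilon \le \mathrm{Tr}\big(\rho^{(n)}\,q\big) = \sum_{i=1}^{m} \langle q_i|\,\rho^{(n)}\,|q_i\rangle \le \sum_{i=1}^{m} r^{(n)}_i < 1-\varepsilon ,$$

where $|q_i\rangle\langle q_i|$ are minimal projections such that $q = \sum_{i=1}^{m}|q_i\rangle\langle q_i|$.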


For showing (7.151), the key result is


Lemma 7.6.3 For an ergodic quantum spin-chain $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$

$$\limsup_{n\to\infty}\frac{1}{n}\beta_{\varepsilon,n}(\omega) \le s(\omega) , \qquad \forall\,\varepsilon \in (0,1) .$$


Before proving it, observe that the previous two lemmas imply a quantum counterpart to the AEP (see Theorem 3.2.2).


Proposition 7.6.1 (Quantum AEP (QAEP)) Let $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ be an ergodic quantum source with entropy rate $s(\omega)$. Then, for every $0 < \varepsilon < 1$,

$$\lim_{n\to\infty}\frac{1}{n}\beta_{\varepsilon,n}(\omega) = s(\omega) . \qquad (7.153)$$

Remark 7.6.1 Operatively, the previous proposition states that any sequence of typical projections must project onto subspaces whose dimension goes as $2^{n\,s(\omega)}$, asymptotically.
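In the Bernoulli case the minimal number of eigenvalues needed to reach weight $1-\varepsilon$ can be computed explicitly, which gives a feeling for the convergence in (7.153). The sketch below assumes a product state of qubits with single-site eigenvalues $(1-r, r)$, $0 < r \le 1/2$; the function name is hypothetical, and plain floating-point arithmetic restricts it to moderate $n$ (a few hundred sites).

```python
from math import ceil, comb, log2


def min_typical_dimension(r: float, n: int, eps: float) -> int:
    """Smallest number of eigenvalues of rho^{(x)n} whose sum is >= 1 - eps.

    The eigenvalues are r**k * (1 - r)**(n - k) with multiplicity comb(n, k);
    for 0 < r <= 1/2 they decrease with k, so the levels are filled in
    increasing order of k until the target weight is reached.
    """
    target, acc, count = 1.0 - eps, 0.0, 0
    for k in range(n + 1):
        eig = r ** k * (1 - r) ** (n - k)
        mult = comb(n, k)
        if acc + mult * eig < target:
            acc += mult * eig
            count += mult
        else:
            # only part of this degenerate level is needed
            needed = ceil((target - acc) / eig)
            return count + min(needed, mult)
    return count  # reached only because of rounding errors


if __name__ == "__main__":
    r, eps = 0.1, 0.1
    entropy = -r * log2(r) - (1 - r) * log2(1 - r)
    for n in (50, 100, 200, 400):
        # the rate log2(N)/n decreases towards the entropy as n grows
        print(n, log2(min_typical_dimension(r, n, eps)) / n, entropy)
```

The convergence is slow, with corrections of order $1/\sqrt{n}$, which is why the rates printed for a few hundred sites still exceed the entropy noticeably.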


Proof of Lemma 7.6.3 Lemma 7.5.1 ensures that for any $\varepsilon > 0$ and fixed $\eta > 0$ there exists $L \in \mathbb{N}$ such that $\ell \ge L$ implies


0< #(Ag.n) < E , #(AG, j = #(Ag,y) =] : 


ne 2 ne ne 


Consider the ef. -ergodic components w; of w, the sets C) @) in (7.146) and the smallest 


(nt. 


projector g = = Vieas, p? ? larger than all p¢” JE AS, Afm=nt+r,0<r< 


£, set qm := qr) ® ilps and, by means of ARN, estimate 


ne—l 


Trjo,m—1(0 qm) = 7 ~~ Tr omit" Gm) 
ta 
1 ng—l1 
t 
= A7 Trjo,ne— O qi”) 
t 
j=0 
e—l 
iX #(AS E 
> — $ Toner”? pP) = —— 40- =) = 1-e. 
ne 420 ne 2 


Then, definition (7.152) and (7.146) imply 


Bem(w) < logs Trio,m—1](4m) = logs Trio,ne-1\(q") + r log, d 


< logy ( > Trion- (P) + r log, d 
JEA? n 
< log, #(A$ „) + nEw) +n) +.) + rlogd , 


whence the result follows from the arbitrariness of 7 and ô and from 


1 ô 
lim sup = “Be, m(W) < ne ee — log, H(A m + slw) +yn+-. 


m—>oo £ 


7.6.1.3 Universal Quantum Compression 

Based on the classical construction of universal codes [201,386], of which a particular instance has been given in Sect. 3.2.1, one may disengage the compression from its explicit dependence on the quantum source statistics by resorting to Universal Quantum Compression Schemes [196].

In the following, the construction in [196] will be slightly modified. We shall consider a quantum spin-chain $\mathcal{A}_{\mathbb{Z}}$ with an ergodic translation-invariant state $\omega$, an increasing sequence of local subalgebras $\mathcal{A}_n$ as defined in Sect. 7.6 and local states $\omega\restriction\mathcal{A}_n$ described by density matrices $\rho^{(n)}$.

The idea is to construct the analogue of the typical subsets $A^{(n)}$ as in the proof of Proposition 3.2.3, with cardinality growing as $2^{nR}$ and probability $\pi^{(n)}(A^{(n)})$ tending to 1 with $n$ for all sources $A$ (Bernoulli in that case) with entropy (rate) $H(A) < R$. As always when passing from classical to quantum sources, typical subsets will be replaced by typical subspaces and by the associated orthogonal projectors.


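Theorem 7.6.3 (Universal Typical Subspaces) Let $s > 0$ and $\varepsilon > 0$. There exists a sequence of projectors $Q^{(n)}_{s,\varepsilon} \in \mathcal{A}_n$, $n \in \mathbb{N}$, such that for $n$ large enough

$$\mathrm{Tr}\big(Q^{(n)}_{s,\varepsilon}\big) \le 2^{n(s+\varepsilon)} \qquad (7.154)$$

and for every ergodic quantum state $\omega$ on $\mathcal{A}_{\mathbb{Z}}$ with entropy rate $s(\omega) < s$ it holds that

$$\lim_{n\to\infty}\omega\big(Q^{(n)}_{s,\varepsilon}\big) = \lim_{n\to\infty}\mathrm{Tr}\big(\rho^{(n)}\,Q^{(n)}_{s,\varepsilon}\big) = 1 . \qquad (7.155)$$


Definition 7.6.3 The orthogonal projectors $Q^{(n)}_{s,\varepsilon}$ in the above theorem will be called universal typical projectors at level $s$.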


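We subdivide the proof of Theorem 7.6.3 into various steps.
Step 1. Let $\ell \in \mathbb{N}$ and $R > 0$. Any Abelian quasi-local subalgebra $\mathcal{C}_\ell^\infty \subseteq \mathcal{A}_{\mathbb{Z}}$ constructed from a maximal Abelian $\ell$-block subalgebra $\mathcal{C}_\ell \subseteq \mathcal{A}_\ell$, together with the probability distribution $\omega\restriction\mathcal{C}_\ell^\infty$, corresponds to a classical ergodic stochastic process. The results in [201] imply that, independently of the latter, there exists a universal sequence of projectors (corresponding to classical universal typical subspaces) $p^{(n)}_{\ell,R} \in \mathcal{C}_\ell^{(n)} \subseteq \mathcal{A}_{\ell n}$ with

$$\frac{1}{n}\log\mathrm{Tr}\big(p^{(n)}_{\ell,R}\big) \le R , \qquad\text{such that}\qquad \lim_{n\to\infty}\pi^{(n)}\big(p^{(n)}_{\ell,R}\big) = 1$$

for any ergodic state $\pi$ on the Abelian algebra $\mathcal{C}_\ell^\infty$ with entropy rate $s(\pi) < R$. Notice that ergodicity and entropy rate of $\pi$ are defined with respect to the shift on $\mathcal{C}_\ell^\infty$, which corresponds to the $\ell$-shift on $\mathcal{A}_{\mathbb{Z}}$.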

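One then applies unitary operators of the factorized form $U^{\otimes n}$, with $U \in \mathcal{A}_\ell$ unitary, to the $p^{(n)}_{\ell,R}$ and introduces the projectors

$$w^{(\ell n)}_{\ell,R} := \bigvee_{U\in\mathcal{A}_\ell\ \text{unitary}} U^{\otimes n}\,p^{(n)}_{\ell,R}\,U^{*\otimes n} . \qquad (7.156)$$

These are, by definition, the smallest projectors such that, for all $U$,

$$U^{\otimes n}\,p^{(n)}_{\ell,R}\,U^{*\otimes n} \le w^{(\ell n)}_{\ell,R} .$$

Let $p^{(n)}_{\ell,R} = \sum_{i\in I}|i^{(n)}_{\ell,R}\rangle\langle i^{(n)}_{\ell,R}|$ be a spectral decomposition of $p^{(n)}_{\ell,R}$ (with $I \subset \mathbb{N}$ some index set), and let $P(V)$ denote the orthogonal projector onto a given subspace $V$. Then, $w^{(\ell n)}_{\ell,R}$ can also be written as

$$w^{(\ell n)}_{\ell,R} = P\Big(\mathrm{span}\big\{U^{\otimes n}\,|i^{(n)}_{\ell,R}\rangle \,:\, i \in I ,\ U \in \mathcal{A}_\ell\ \text{unitary}\big\}\Big) .$$

It proves convenient to consider the projectors

$$\tilde w^{(\ell n)}_{\ell,R} := P\Big(\mathrm{span}\big\{A^{\otimes n}\,|i^{(n)}_{\ell,R}\rangle \,:\, i \in I ,\ A \in \mathcal{A}_\ell\big\}\Big) , \qquad w^{(\ell n)}_{\ell,R} \le \tilde w^{(\ell n)}_{\ell,R} . \qquad (7.157)$$

Given $m = n\ell + k$ with $n \in \mathbb{N}$ and $k \in \{0, \ldots, \ell-1\}$, let

$$w^{(m)}_{\ell,R} := w^{(\ell n)}_{\ell,R}\otimes\mathbb{1}^{\otimes k} \in \mathcal{A}_m , \qquad \tilde w^{(m)}_{\ell,R} := \tilde w^{(\ell n)}_{\ell,R}\otimes\mathbb{1}^{\otimes k} \in \mathcal{A}_m .$$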

These are projectors and, as in [192], one estimates the trace of $\tilde w^{(m)}_{\ell,R} \in \mathcal{A}_m$ as follows. By an argument similar to that used in the proof of Lemma 3.2.2, the dimension of the symmetric subspace $\mathrm{SYM}(\mathcal{A}_\ell, n) := \mathrm{span}\{A^{\otimes n} : A \in \mathcal{A}_\ell\}$ is upper bounded by $(n+1)^{\dim\mathcal{A}_\ell}$, thus

tw=n anto Trp, am a, 
(7.158) 
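The polynomial bound on the dimension of the symmetric subspace is a standard counting fact and is easily checked numerically. In the sketch below, $D$ stands for the dimension of $\mathcal{A}_\ell$ regarded as a vector space (so $D = d^{2\ell}$ for $\mathcal{A}_\ell = M_d(\mathbb{C})^{\otimes\ell}$); the function name is illustrative.

```python
from math import comb


def symmetric_dimension(D: int, n: int) -> int:
    """Dimension of the totally symmetric subspace of n copies of a D-dimensional space.

    It counts the occupation numbers (n_1, ..., n_D) with n_1 + ... + n_D = n.
    Each n_i takes at most n + 1 values, which gives the bound (n + 1)**D.
    """
    return comb(n + D - 1, n)


if __name__ == "__main__":
    D = 4  # e.g. a single qubit site, d = 2 and l = 1, so D = d**(2*l) = 4
    for n in (1, 10, 100):
        # polynomial growth in n, to be compared with the exponential D**n
        print(n, symmetric_dimension(D, n), (n + 1) ** D, D ** n)
```

Since the dimension grows only polynomially in $n$, it does not contribute to the exponential rate when the logarithm of the trace is divided by $n$.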
Step 2. Consider a stationary ergodic state w on the spin-chain Az with entropy 
rate s(w) < s. Let £, ô > 0. If £ is chosen large enough, then the projectors w, 
where R := £(s + 5), are ô—typical for w i.e. 


Tr (w) 21-8, 


for m € N sufficiently large. This follows from the result in Proposition 7.4.9 con- 
cerning the convex decomposition of the ergodic state w into k(€) < £ states wt”, 


| @ 


~ kD 4 
i=l 
entropy rate (with respect to the €—shift) equal to £ s (w). 

Moreover, according to Lemma 7.5.1, for every A > 0, if one defines the 
set of integers Aya := {i € {1,...,k(Q}: Sw) > Ls) + AD}, then these 
states enjoy the following property with respect to the von Neumann entropy: 

#(Aga) _ 

im 

loo k£) 


wi, (o , that are ergodic with respect to the £—shift on Az and have an 
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Let Ci be the maximal Abelian subalgebra of Ag generated by the one- 


dimensional eigenprojectors of the density matrices corresponding to w E€ Ae. 
The restriction of w;,¢ to the Abelian quasi-local algebra C>> generated i Cig is 
again an ergodic state. From the properties of the entropy density and of the von 
Neumann entropy one derives the chain of bounds 


L- slw) = sW) < sw; CO) < Sw CiD) = SW). 


Further, with A := 7 — s(w), ifi € Aj , one has the upper bound S$ wh) < R. 


Let U; € Ag be a unitary operator such that U8” pou € ch. For every 
ie Ala it holds that 


wl (w le x a ux8") es (7.159) 


We can thus fix an £ € M large enough to fulfill * a a) >1-5 ô and use the ergodic 


decomposition to obtain the lower bound 


1 ô 

£ £ £ £ £ 

w™ (wl ( a) > 5 > w ( ro ( Be > (1 ), min yg ae 
icA’ A A 


Then (7.159) yields 
w (WED) > ow Ral. 


Step 3. One can now proceed as in [196] and introduce a sequence of integers $\ell_m$, $m \in \mathbb{N}$, where each $\ell_m$ is a power of 2 fulfilling the inequality

$$\ell_m\,2^{3\ell_m} \le m < 2\ell_m\,2^{3\cdot 2\ell_m} . \qquad (7.160)$$

Let the integer sequence $n_m$ and the real-valued sequence $R_m$ be defined by $n_m := \lfloor m/\ell_m\rfloor$, respectively $R_m := \ell_m\cdot\big(s + \frac{\varepsilon}{2}\big)$ and set

(mmm) s 3-Lm 
Qo™ = : Woy. Rin ifm = m2, (7.161) 
SE yin Q id2"—'n"™) otherwise . 
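The block lengths $\ell_m$ can be tabulated directly. The sketch below assumes that (7.160) reads $\ell_m\,2^{3\ell_m} \le m < 2\ell_m\,2^{6\ell_m}$, which is the reading compatible with the bounds on $m/\ell_m$ used in the estimate that follows; with this reading the admissible intervals of consecutive powers of 2 tile the integers $m \ge 8$. The function name `block_length` is hypothetical.

```python
def block_length(m: int) -> int:
    """Power of two l such that l * 2**(3*l) <= m < 2*l * 2**(6*l).

    The upper end of the interval for l coincides with the lower end of the
    interval for 2*l, so exactly one power of two qualifies for each m >= 8.
    """
    if m < 8:
        raise ValueError("no admissible block length for m < 8")
    l = 1
    while 2 * l * 2 ** (6 * l) <= m:
        l *= 2
    return l


if __name__ == "__main__":
    for m in (8, 127, 128, 16383, 16384, 10 ** 9):
        l = block_length(m)
        print(m, l, m // l)  # m, block length l_m, number of blocks n_m
```

The block length grows only logarithmically with $m$, while the number of blocks $n_m = \lfloor m/\ell_m\rfloor$ grows at least as $2^{3\ell_m}$.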
Observe that 
1 m] 1 1 
— log Tr Q@) < log Tra”) < og(Mm + 1) 
m Nmťm m Nm Em Nm 
4" 6lm +2 E 1 
sE 
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where the second inequality follows from (7.158) and the last one from the bounds 
On Nm 
234m = 1 < m Aa 1 < Nm < m < 26lm+1 : 
m m 


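Thus, for large $m$, it holds

$$\frac{1}{m}\log\mathrm{Tr}\,Q^{(m)}_{s,\varepsilon} \le s + \varepsilon . \qquad (7.162)$$

By the special choice (7.160) of $\ell_m$ it is ensured that the sequence of projectors $Q^{(m)}_{s,\varepsilon} \in \mathcal{A}_m$ is indeed typical for any quantum state $\omega$ with entropy rate $s(\omega) < s$. This means that $\{Q^{(m)}_{s,\varepsilon}\}_{m\in\mathbb{N}}$ is a sequence of universal typical projectors at level $s$.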


7.6.2 Quantum Capacities 


The $\chi$-quantity in Holevo's bound (6.45) limits the amount of classical information that can be retrieved by a POVM measurement from encoding classical symbols $i \in I_A = \{1, 2, \ldots, a\}$ by quantum (mixed) states, $i \mapsto \rho_i$, coming from a mixture $\rho = \sum_{i\in I_A} p_i\,\rho_i \in B_1^+(\mathbb{H})$ with a priori probabilities $p_i$. In particular, the bound is in general hardly reachable; however, like in classical capacity theory, the amount of retrievable classical information per transmitted quantum state can be made arbitrarily close to the Holevo bound by means of suitable encodings of longer and longer strings $i^{(n)} = i_1 i_2\cdots i_n \in I_A^{(n)} := \underbrace{I_A\times\cdots\times I_A}_{n\ \text{times}}$. Also, the Holevo bound is a limit to the classical information per letter that can be encoded into quantum states and retrieved with negligible errors. As we shall see, this state of affairs will lead to different definitions of quantum capacities.
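For qubit ensembles the $\chi$-quantity $S(\rho) - \sum_i p_i\,S(\rho_i)$ can be evaluated in closed form, since a qubit state with Bloch vector $\vec r$ has eigenvalues $(1\pm|\vec r|)/2$. The following is a small sketch under that assumption; the Bloch-vector parametrization and the function names are choices made here for illustration only.

```python
from math import log2, sqrt


def qubit_entropy(bloch: tuple[float, float, float]) -> float:
    """von Neumann entropy, in bits, of the qubit state (1 + r.sigma)/2."""
    length = min(1.0, sqrt(sum(c * c for c in bloch)))  # clamp rounding errors
    entropy = 0.0
    for lam in ((1 + length) / 2, (1 - length) / 2):
        if lam > 0:
            entropy -= lam * log2(lam)
    return entropy


def holevo_chi(probs: list[float], blochs: list[tuple[float, float, float]]) -> float:
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i) for qubit states rho_i."""
    mean = tuple(sum(p * b[a] for p, b in zip(probs, blochs)) for a in range(3))
    return qubit_entropy(mean) - sum(p * qubit_entropy(b) for p, b in zip(probs, blochs))


if __name__ == "__main__":
    eps = 0.1
    # The two equiprobable pure states of Example 7.6.1: their Bloch vectors
    # are (+-2 sqrt(eps (1 - eps)), 0, 1 - 2 eps).
    x, z = 2 * sqrt(eps * (1 - eps)), 1 - 2 * eps
    print(holevo_chi([0.5, 0.5], [(x, 0.0, z), (-x, 0.0, z)]))
```

For an ensemble of pure states the second term vanishes and $\chi$ reduces to $S(\rho)$; for the two non-orthogonal states above this is the binary entropy of $\varepsilon$, strictly less than the one bit carried by the classical label.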
In order to prepare the ground for a detailed discussion, appropriate notations 
must be introduced. 
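We spectralize $\rho = \sum_{\alpha\in I_B} r(\alpha)\,|r(\alpha)\rangle\langle r(\alpha)|$, set $I_B^{(n)} := \underbrace{I_B\times\cdots\times I_B}_{n\ \text{times}}$, denote $\alpha^{(n)} = \alpha_1\alpha_2\cdots\alpha_n$, $\alpha_i \in I_B$, and write

$$r(\alpha^{(n)}) := \prod_{j=1}^{n} r(\alpha_j) , \qquad |r(\alpha^{(n)})\rangle := \bigotimes_{j=1}^{n}|r(\alpha_j)\rangle \qquad (7.163)$$

$$\rho(i^{(n)}) := \rho_{i_1}\otimes\rho_{i_2}\otimes\cdots\otimes\rho_{i_n} , \qquad P(i^{(n)}) := \prod_{j=1}^{n} p_{i_j} \qquad (7.164)$$

$$\rho^{\otimes n} = \sum_{i^{(n)}\in I_A^{(n)}} P(i^{(n)})\,\rho(i^{(n)}) = \sum_{\alpha^{(n)}\in I_B^{(n)}} r(\alpha^{(n)})\,|r(\alpha^{(n)})\rangle\langle r(\alpha^{(n)})| . \qquad (7.165)$$

On the other hand, the spectral decompositions of the quantum code-words

$$\rho_i = \sum_{k\in J_i} p(k|i)\,|p(k|i)\rangle\langle p(k|i)| , \qquad (7.166)$$

with eigenvalues $0 \le p(k|i) \le 1$ and eigenprojectors $|p(k|i)\rangle\langle p(k|i)|$, provide conditional and joint probabilities.

Denote by $A^{(n)}$, respectively $K^{(n)}$, the stochastic variables with outcomes $i^{(n)}$, respectively $k^{(n)} = k_{i_1}k_{i_2}\cdots k_{i_n} \in I_K^{(n)}$, where $I_K^{(n)} := \bigcup_{i^{(n)}\in I_A^{(n)}} J(i^{(n)})$ with $J(i^{(n)}) := \times_{j=1}^{n} J_{i_j}$. Finally, assign to $A^{(n)}$ and $K^{(n)}$ conditional and joint probabilities defined by $\pi_{K^{(n)}|A^{(n)}} = \{P(k^{(n)}|i^{(n)})\}$, where

$$P(k^{(n)}|i^{(n)}) = \prod_{j=1}^{n} p(k_j|i_j) , \qquad (7.167)$$

and by $\pi_{A^{(n)}\vee K^{(n)}} = \{P(i^{(n)}, k^{(n)})\}$, where

$$P(i^{(n)}, k^{(n)}) = P(i^{(n)})\,P(k^{(n)}|i^{(n)}) . \qquad (7.168)$$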


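Shannon entropies are computed using (7.164) and additivity of von Neumann entropy, as follows:

$$H(A^{(n)}) = n\,H(A) = -n\sum_{i=1}^{a} p_i\log_2 p_i \qquad (7.169)$$

$$H(A^{(n)}\vee K^{(n)}) = H(A^{(n)}) + \sum_{i^{(n)}\in I_A^{(n)}} P(i^{(n)})\,H(K^{(n)}|i^{(n)}) = H(A^{(n)}) + \sum_{i^{(n)}\in I_A^{(n)}} P(i^{(n)})\,S\big(\rho(i^{(n)})\big) = n\Big(H(A) + \sum_{i=1}^{a} p_i\,S(\rho_i)\Big) . \qquad (7.170)$$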


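According to the AEP (see Proposition 3.2.2), for any fixed $\varepsilon > 0$, we can distinguish a subset of $\pi_{A^{(n)}}$-typical strings $i^{(n)} \in U^{(n)}_\varepsilon \subseteq I_A^{(n)}$ and a subset of $\pi_{A^{(n)}\vee K^{(n)}}$-typical strings $(i^{(n)}, k^{(n)}) \in V^{(n)}_\varepsilon \subseteq I_A^{(n)}\times I_K^{(n)}$. These subsets are such that $\pi_{A^{(n)}}(U^{(n)}_\varepsilon) \ge 1-\varepsilon$ and $\pi_{A^{(n)}\vee K^{(n)}}(V^{(n)}_\varepsilon) \ge 1-\varepsilon$. Furthermore, if $i^{(n)} \in U^{(n)}_\varepsilon$ then

$$2^{-n(H(A)+\varepsilon)} \le P(i^{(n)}) \le 2^{-n(H(A)-\varepsilon)} , \qquad (7.171)$$

while if $(i^{(n)}, k^{(n)}) \in V^{(n)}_\varepsilon$, then

$$2^{-n\left(H(A)+\sum_i p_i S(\rho_i)+\varepsilon\right)} \le P(i^{(n)}, k^{(n)}) \le 2^{-n\left(H(A)+\sum_i p_i S(\rho_i)-\varepsilon\right)} . \qquad (7.172)$$

As for classical capacity (see Sect. 3.2.2), we distinguish one more typical subset, $W^{(n)}_\varepsilon \subseteq I_A^{(n)}\times I_K^{(n)}$, consisting of all jointly typical pairs $(i^{(n)}, k^{(n)}) \in V^{(n)}_\varepsilon$, where also $i^{(n)} \in U^{(n)}_\varepsilon$. From (7.168), (7.171) and (7.172),

$$2^{-n(S(\rho)-\chi+2\varepsilon)} \le P(k^{(n)}|i^{(n)}) \le 2^{-n(S(\rho)-\chi-2\varepsilon)} , \qquad (7.173)$$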

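where (6.45) has been used and $\chi$ is Holevo's $\chi$-quantity for $\rho$ and its decomposition $\rho = \sum_i p_i\,\rho_i$. Finally, in terms of (7.164) and (7.165),

$$\rho(i^{(n)}) = \sum_{k^{(n)}\in J(i^{(n)})} P(k^{(n)}|i^{(n)})\,|P(k^{(n)}|i^{(n)})\rangle\langle P(k^{(n)}|i^{(n)})| \qquad (7.174)$$

$$\rho^{\otimes n} = \sum_{i^{(n)}\in I_A^{(n)}}\ \sum_{k^{(n)}\in J(i^{(n)})} P(i^{(n)})\,P(k^{(n)}|i^{(n)})\,|P(k^{(n)}|i^{(n)})\rangle\langle P(k^{(n)}|i^{(n)})| , \qquad (7.175)$$

where (see (7.166)) $|P(k^{(n)}|i^{(n)})\rangle := \bigotimes_{j=1}^{n}|p(k_j|i_j)\rangle$.

As shown in the proof of Theorem 7.6.1, for any $\varepsilon > 0$ there exists an orthogonal projector $\Pi_\varepsilon \in B(\mathbb{H}^{\otimes n})$ that commutes with $\rho^{\otimes n}$ and $\mathrm{Tr}(\rho^{\otimes n}\,\Pi_\varepsilon) \ge 1-\varepsilon$. Analogously, if the sum in (7.175) is restricted to $W^{(n)}_\varepsilon$, one gets a positive operator $\rho_n \le \rho^{\otimes n} \in B_1^+(\mathbb{H}^{\otimes n})$ with

$$\mathrm{Tr}\,\rho_n = \sum_{(i^{(n)},k^{(n)})\in W^{(n)}_\varepsilon} P(i^{(n)}, k^{(n)}) = \Big(\sum_{(i^{(n)},k^{(n)})\in V^{(n)}_\varepsilon} - \sum_{\substack{(i^{(n)},k^{(n)})\in V^{(n)}_\varepsilon \\ i^{(n)}\notin U^{(n)}_\varepsilon}}\Big)\,P(i^{(n)}, k^{(n)}) \ge 1-2\varepsilon . \qquad (7.176)$$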


Theorem 7.6.4 ([164,316]) Let p = Dern Pipi € Bi (H) provide a statistical mix- 
ture of quantum states available for encoding symbols i € I, emitted by a classical 
source; let X := X (p, {pi piliera). For any fixed 6 > 0 and sufficiently large n, there 


exists an encoding i? > £(i™®) = p(i™) on a subset l4 C Ii, consisting of M 
strings and a decoding POVM 


= (H®") > B® _ [fi Wk |g X Wk |) | ely 
EO KO ew” 
Ui- Ð eee) ware |], 


i”) el, 
GO) KM ew 
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such that < 6 and e, < 6, where e, is the decoding error 


M 
a 
n 


1 2 
en:=1- a So PEPE) OPE MM) PEPE) 017 


Mel, 
KM) ese) y 


The strategy of the proof consists of the following steps.


1. Choose an equidistributed set $\tilde I_A \subseteq I_A^{(n)}$ of $M$ sequences $i^{(n)}$ and encode each of them by a density matrix $\mathcal{E}(i^{(n)}) = \rho(i^{(n)})$. The encoding thus provides the density matrix

$$\rho_{\mathcal{E}} = \frac{1}{M}\sum_{i^{(n)}\in\tilde I_A}\mathcal{E}(i^{(n)}) = \frac{1}{M}\sum_{i^{(n)}\in\tilde I_A}\ \sum_{k^{(n)}\in J(i^{(n)})} P(k^{(n)}|i^{(n)})\,|P(k^{(n)}|i^{(n)})\rangle\langle P(k^{(n)}|i^{(n)})| . \qquad (7.178)$$

2. Choose random encodings $\mathcal{E}$ as in the proof of Theorem 3.2.3: the argument does ensure that the required encoding and decoding procedures exist, but does not provide concrete instances of them (for a similar result see [175,177]).

3. As regards the decoding protocol, the idea is to try to identify by means of vectors $|\Psi(k^{(n)}|i^{(n)})\rangle$ (possibly of norm less than 1) only those $|P(k^{(n)}|i^{(n)})\rangle$ in (7.178) which are labeled by pairs $(i^{(n)}, k^{(n)}) \in W^{(n)}_\varepsilon$, after the same have been projected by $\Pi_\varepsilon$ onto the chosen high probability subspace of $\rho^{\otimes n}$. All other $|P(k^{(n)}|i^{(n)})\rangle$ will be made to correspond to $|\Psi(k^{(n)}|i^{(n)})\rangle = 0$.

4. The non-trivial decoding vectors are constructed as follows. (For the sake of simplicity we shall denote multi-indices $i^{(n)}$, $k^{(n)}$ as $i$ and $k$.) Set $|\Phi(i,k)\rangle := \Pi_\varepsilon\,|P(k|i)\rangle$ and consider the matrix $S$ with entries

$$S_{(i,k);(j,l)} = \langle\Phi(i,k)|\Phi(j,l)\rangle = \langle P(k|i)|\,\Pi_\varepsilon\,|P(l|j)\rangle , \qquad (7.179)$$

where $i, j \in \tilde I_A$, $k \in J(i)$, $l \in J(j)$ and $(i,k), (j,l) \in W^{(n)}_\varepsilon$. This matrix is positive and its square root defines vectors $|\Psi(i,k)\rangle$ such that [164]

$$\big(\sqrt{S}\big)_{(i,k);(j,l)} = \langle\Psi(i,k)|\Phi(j,l)\rangle = \langle\Psi(i,k)|P(l|j)\rangle .$$

Namely, $|\Psi(i,k)\rangle = Q^{-1/2}\,|\Phi(i,k)\rangle$, where $Q^{-1/2}$ is the (positive) inverse square-root of $Q := \sum^*_{(i,k)}|\Phi(i,k)\rangle\langle\Phi(i,k)|$ (defined only on the range of $Q$), where $\sum^*_{(i,k)}$ denotes the sum restricted to the pairs $(i,k) \in W^{(n)}_\varepsilon$.
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5. The error (7.177) is complementary to the ensemble fidelity (see Definition 6.3.6) 
that has been used in Theorem 7.6.1. Using the previous definitions and that 
Q7 1/2 > 0, it can be bounded as follows: 


en < i È PED (1 - (HEIDI PAID?) 
rer) 
2 * 
2 7 5 > PKD VSG h:i - (7.180) 


icf, WK) 


Proof of Theorem 7.6.4 Since $2(x - \sqrt{x}) \le x^2 - x$ for all $x \ge 0$, the same inequality holds by substituting $x$ with the positive matrix $S$ in (7.179); taking diagonal values:

$$\big(\sqrt{S}\big)_{(i,k);(i,k)} \ge \frac{3}{2}\,S_{(i,k);(i,k)} - \frac{1}{2}\sum_{(j,l)} S_{(i,k);(j,l)}\,S_{(j,l);(i,k)} .$$
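The scalar inequality behind this step can be verified by an elementary factorization; the following short derivation is added for convenience and uses only the substitution $t = \sqrt{x}$.

```latex
% With t = \sqrt{x} \ge 0, the inequality 2(x - \sqrt{x}) \le x^2 - x is equivalent to
\[
  x^2 - 3x + 2\sqrt{x} \;=\; t^4 - 3t^2 + 2t \;=\; t\,(t^3 - 3t + 2) \;=\; t\,(t-1)^2\,(t+2) \;\ge\; 0 ,
\]
% with equality only at x = 0 and x = 1. Rearranged, it reads
\[
  \sqrt{x} \;\ge\; \frac{3}{2}\,x - \frac{1}{2}\,x^2 , \qquad x \ge 0 .
\]
```

By the spectral theorem the same relation holds with $x$ replaced by a positive matrix $S$, i.e. $\sqrt{S} \ge \frac{3}{2}S - \frac{1}{2}S^2$, and the diagonal entries of $S^2$ are the sums $\sum_{(j,l)} S_{(i,k);(j,l)}\,S_{(j,l);(i,k)}$.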
Then, using the first inequality in (7.173), the bound (7.180) can conveniently be 
recast as follows 


en < D 


iela (i,k) 


1 aes 
+a DD PRD SimG.0 
icfa GH) 
— x+2e) ; i 
5 3 P (kli) PEA) S,0:G.0 SG.0:4.8 - 


i, jela DEGA 


Suppose now that the $M$ words $i^{(n)} \in \tilde I_A$ are chosen randomly according to the probability distribution $P(i^{(n)})$; we obtain in this way a statistical ensemble of random codes and, as in the classical case, by averaging over the contributions of the randomly chosen $i^{(n)}$ one eliminates the dependence on $\tilde I_A$ and is left with a sum over all $i^{(n)} \in I_A^{(n)}$. Therefore, the average error can be estimated from above by


* 
e < 2-3 Y YO PH PRD Si.w:G.0 (7.181) 
ier® G5) 
Li 
+ D TEPO PEDS DED (7.182) 
ier® (i,k) 


Laa 
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M(M = 1) jn(s(p)—x426) 
M 


* 
x So SS POPO PHD PEJ) 86.6: SG.0:G,.4) (7-183) 
i jel COAG 


Lap 


From (7.179) and (7.176), 
Ly = Te( pale) = Tr(p®"M1-) — Te(Gn — p™)Me) = 1-32 (1). 


Further, $S_{(i,k);(i,k)} \leq 1$, thus $L_{2a} \leq 1$ (2a). Finally, the last sum can be bounded from
above by observing that if $0 \leq A \leq B$ and $0 \leq C \leq D$, then by means of cyclicity
under the trace operation,

$$\mathrm{Tr}(AC) = \mathrm{Tr}(\sqrt{A}\, C \sqrt{A}) \leq \mathrm{Tr}(AD) = \mathrm{Tr}(\sqrt{D}\, A \sqrt{D}) \leq \mathrm{Tr}(BD) .$$


Therefore, since I commutes with p2", Lop < T(G): on the other hand, 


from the quantum AEP we know that the dimension of the subspace projected out
by $\Pi_\varepsilon$ is $\leq 2^{n(S(\rho)+\varepsilon)}$ with eigenvalues $\leq 2^{-n(S(\rho)-\varepsilon)}$, whence
$L_{2b} \leq 2^{-n(S(\rho) - 3\varepsilon)}$ (2b). Altogether, inequalities (1), (2a) and (2b) yield

$$\overline{e}_n \leq 2 - 3(1 - 3\varepsilon) + 1 + (M-1)\, 2^{-n(\chi - 5\varepsilon)} \leq 9\varepsilon + 2^{-n(\chi - R - 5\varepsilon)} ,$$

where we have put the growth rate $R$ into evidence, $M = 2^{nR}$. The latter can be
chosen arbitrarily close to the Holevo $\chi$ quantity and still the average error becomes
negligible with $n \to \infty$. Therefore, for any $\delta > 0$ and $n$ large enough there is an
$I_A \subseteq I^{(n)}$ with $R \geq \chi - \delta$ and $e_n \leq \delta$.


Example 7.6.2 Suppose the sender encodes classical symbols $i \in I$ into states that
she obtains by acting locally with unitary operators $U_i$ on her system in a state
$\rho_{12} \in M_{d_1}(\mathbb{C}) \otimes M_{d_2}(\mathbb{C})$ which she shares with the receiver. She selects the unitary
operators $U_i$ with probabilities $p_i$, and after changing $\rho_{12}$ into
$\rho_i = U_i \otimes \mathbf{1}_2\, \rho_{12}\, U_i^\dagger \otimes \mathbf{1}_2$ she sends her system to the receiver. The sender tries to
maximize the information accessible to the receiver by optimizing the Holevo bound, thus seeking [90]

$$C_H := \max_{\{p_i, U_i\}} \Big[ S(\rho) - \sum_i p_i\, S(\rho_i) \Big] , \qquad \rho = \sum_i p_i\, \rho_i .$$

Note that $S(\rho_i) = S(\rho_{12})$, for unitary transformations do not change the von Neumann
entropy; in order to maximize $S(\rho)$, consider the marginal states

$$\rho^{(1)} = \mathrm{Tr}_2(\rho) , \qquad \rho^{(2)} = \mathrm{Tr}_1(\rho) = \rho_2 \ \big(= \mathrm{Tr}_1\, \rho_{12}\big) .$$

By subadditivity (5.170) and (5.165),

$$C_H \leq S(\rho^{(1)}) + S(\rho^{(2)}) - S(\rho_{12}) \leq \log d_1 + S(\rho_2) - S(\rho_{12}) .$$

Choose as unitary operators the $d_1^2$ Weyl operators $W_{d_1}(n)$ of Example 5.4.3 with
equal probabilities $1/d_1^2$; then, using (5.90) and (5.32), one gets

$$\rho = \frac{1}{d_1^2} \sum_{n=(n_1,n_2)} W_{d_1}(n) \otimes \mathbf{1}_2\, \rho_{12}\, W_{d_1}^\dagger(n) \otimes \mathbf{1}_2 = \frac{\mathbf{1}_1}{d_1} \otimes \rho_2 ,$$

so that $C_H \geq \log d_1 + S(\rho_2) - S(\rho_{12})$. This transmission protocol can thus achieve
an optimal quantum transmission rate $C_H = \log d_1 + S(\rho_2) - S(\rho_{12})$.
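
To make the rate $\log d_1 + S(\rho_2) - S(\rho_{12})$ concrete, the short Python sketch below evaluates it for a two-qubit family that is only an assumed illustration: $\rho_{12} = p\,|\Phi^+\rangle\langle\Phi^+| + (1-p)\,\mathbf{1}/4$, whose spectrum is known in closed form. The function names and the choice of this family are hypothetical and not taken from the text; entropies are in bits (logarithms to base 2).

```python
import math


def entropy_bits(eigenvalues):
    """Von Neumann entropy (base 2) from a list of eigenvalues; zeros are skipped."""
    return -sum(p * math.log2(p) for p in eigenvalues if p > 0)


def dense_coding_rate(d1, eig_rho2, eig_rho12):
    """log d1 + S(rho_2) - S(rho_12), the rate reached with Weyl-operator encoding."""
    return math.log2(d1) + entropy_bits(eig_rho2) - entropy_bits(eig_rho12)


def isotropic_two_qubit_spectra(p):
    """Spectra of rho_12 = p |Phi+><Phi+| + (1 - p) 1/4 and of its marginal rho_2.

    This family is an assumed example: rho_12 has one eigenvalue p + (1 - p)/4 and
    three eigenvalues (1 - p)/4, while the marginal is maximally mixed.
    """
    eig_rho12 = [p + (1.0 - p) / 4.0] + [(1.0 - p) / 4.0] * 3
    eig_rho2 = [0.5, 0.5]
    return eig_rho2, eig_rho12


for p in (0.0, 0.5, 1.0):
    eig2, eig12 = isotropic_two_qubit_spectra(p)
    print(p, dense_coding_rate(2, eig2, eig12))
```

For the maximally entangled state ($p = 1$) the rate equals two bits per transmitted qubit, while for the maximally mixed state ($p = 0$) it drops to zero, in agreement with the fact that local unitaries cannot imprint any information on it.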


In general, like in classical transmission, the quantum states that have been used to
code and transmit information are subjected to perturbing effects of the transmission
channel which is being used. Concretely, if the quantum code-words are projections
$P_i \in M_d(\mathbb{C})$ chosen with probabilities $p_i$, thus making a statistical ensemble
described by the density matrix $\rho = \sum_i p_i P_i$, a probability preserving channel acts
on them as a trace-preserving CP map $\Lambda$. In the light of the previous theorem, the
channel capacity is defined by [176,326]

$$C_H[\Lambda] = \max_{\{p_i, P_i\}} \Big\{ S\big(\Lambda[\rho]\big) - \sum_i p_i\, S\big(\Lambda[P_i]\big) \Big\} .$$

As for the entanglement cost (see (6.8)), in order to improve the capacity, one
may consider $n$ uses of the channel, thus a CP map $\Lambda^{\otimes n}$ acting on states on $M_d(\mathbb{C})^{\otimes n}$
which may carry entanglement between different uses. Then one introduces the
regularized capacity

$$C_\infty[\Lambda] := \lim_{n \to +\infty} \frac{1}{n}\, C_H[\Lambda^{\otimes n}] .$$

Such a limit exists because the capacity is superadditive; indeed, consider
$C_H[\Lambda_1 \otimes \Lambda_2]$ and two statistical ensembles $\{p_i^{(1)}, P_i^{(1)}\}$ and $\{p_j^{(2)}, P_j^{(2)}\}$, with
density matrices $\rho_1$ and $\rho_2$, that achieve $C_H[\Lambda_1]$, respectively $C_H[\Lambda_2]$. The additivity
of the von Neumann entropy over tensor product states implies that, for the not necessarily
optimal statistical ensemble $\{p_i^{(1)} p_j^{(2)}, P_i^{(1)} \otimes P_j^{(2)}\}$,

$$C_H[\Lambda_1 \otimes \Lambda_2] \geq S\big(\Lambda_1[\rho_1]\big) - \sum_i p_i^{(1)} S\big(\Lambda_1[P_i^{(1)}]\big) + S\big(\Lambda_2[\rho_2]\big) - \sum_j p_j^{(2)} S\big(\Lambda_2[P_j^{(2)}]\big) = C_H[\Lambda_1] + C_H[\Lambda_2] .$$

Were the capacity additive, the regularized capacity would coincide with $C_H[\Lambda]$.
This is another important open question in quantum information which is actually
equivalent to the additivity of the entanglement of formation [326] (see also [59]
for an approach to this problem based on the relations between the entanglement of
formation and the entropy of a subalgebra).


7.7 Quantum Fluctuations and Mesoscopic Physics 


For the description of systems consisting of a large number N of elementary quan- 
tum constituents, it is often convenient to resort to collective observables involving 
all degrees of freedom as they can be directly connected to measurable quantities. 
Collective observables grow with N and thus need to be scaled by suitable powers 
of 1/N in order to obtain physically sensible quantities [324,325]. 

Typical examples of collective observables are mean-field operators obtained as 
averages over all single particle contributions, an example of which is the mean mag- 
netization in spin systems (see (7.3)). Although single particle observables possess 
a quantum character, the greater the number N of constituents, the more classical is 
the behaviour of mean-field observables: they thus become examples of macroscopic 
observables. The well-established mean-field approach to the study of many-body 
systems precisely accounts for their behaviour at this macroscopic, semiclassical 
level, where very little, if anything at all, of the microscopic quantum character survives.

Yet, coherent quantum behaviour can also be found in systems made of a large 
number of particles such as ultracold atoms trapped in optical lattices, hybrid atom- 
photon or optomechanical systems (see the review [32] for a list of references), where 
decoherence effects can hardly be neglected and emerging classicality is ultimately 
expected. Indeed, mean-field observables cannot be used to explain such a non-classical
behaviour. In the following, we introduce a class of differently scaled collective
observables, known as quantum fluctuations [32,363], whose large-$N$ limit can,
under suitable conditions, be controlled. They are scaled by $1/\sqrt{N}$ and thus retain
quantum properties when $N \to +\infty$, typically yielding non-commutative algebras.

Being half-way between microscopic observables related to single particles, and 
macroscopic classical mean-field observables, they provide useful tools for the study 
of quantum many-body properties at an intermediate, mesoscopic scale. This is the 
scale inherent in the quantum behaviour of superconducting macroscopic circuits, so
important in present-day quantum computing technologies [368], especially in connection
with the generation and persistence of mesoscopic entanglement (see [32]).


7.7.1 Mean-Field Observables 


In considering quantum systems composed of $N$ (distinguishable) particles, we shall
make abundant use of the algebraic tools presented in Sect. 7.4. Being distinguishable,
each particle in the many-body system can be identified by an integer index $k \in \mathbb{N}$
and described by the same $C^*$ algebra $a^{[k]}$ of single-particle observables. Referring
to different degrees of freedom, operator algebras of different particles commute:
$[a^{[i]}, a^{[j]}] = 0$, $i \neq j$. By means of their tensor product one constructs local algebras
accommodating finite numbers of them:

$$\mathcal{A}_{[p,q]} = \bigotimes_{k=p}^{q} a^{[k]} , \qquad p, q \in \mathbb{N} , \quad p \leq q , \qquad (7.184)$$

and then, by an inductive limit, the quasi-local algebra $\mathcal{A}$ that contains all of them. In
the following, generic elements of $\mathcal{A}$ are denoted by capital letters, $X$, while lower
case letters, $x$, represent elements of $a$, embedded into $\mathcal{A}$ as

$$x^{[k]} = \cdots \otimes \mathbf{1} \otimes x \otimes \mathbf{1} \otimes \cdots , \qquad x \text{ in the } k\text{th position} .$$


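The states of the system are positive, normalized, linear functionals $\omega$ on $\mathcal{A}$ that we
shall take to be translationally invariant and spatially clustering (see Definition 7.4.6),
namely without correlations between far away localized operators with respect to
the weak-operator topology,

$$\omega(x^{[j]}) = \omega(x^{[k]}) , \qquad j \neq k , \quad \forall x \in a , \qquad (7.185)$$

$$\lim_{|z| \to \infty} \omega\big(A^\dagger\, \tau_z(X)\, B\big) = \omega(A^\dagger B) \lim_{|z| \to \infty} \omega\big(\tau_z(X)\big) , \qquad (7.186)$$

where $\tau_z : \mathcal{A} \to \mathcal{A}$, $z \in \mathbb{Z}$, are spatial translation automorphisms on $\mathcal{A}$.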

In order to move from a microscopic description based on local algebras to a
many-body one involving collective operators, a suitable scaling ought to be chosen.
The simplest examples of collective observables are mean-field operators, i.e. averages
of $N$ copies of the same single-site observable $x$:

$$\overline{X}^{(N)} = \frac{1}{N} \sum_{k=1}^{N} x^{[k]} . \qquad (7.187)$$

Consider the commutator of two such operators, $\overline{X}^{(N)}$ and $\overline{Y}^{(N)}$, constructed from
single-particle observables $x$ and $y$:

$$\big[\overline{X}^{(N)}, \overline{Y}^{(N)}\big] = \frac{1}{N^2} \sum_{j,k=1}^{N} \big[x^{[j]}, y^{[k]}\big] = \frac{1}{N^2} \sum_{k=1}^{N} \big[x^{[k]}, y^{[k]}\big] ,$$

where the last equality comes from the fact that operators referring to different
particles commute. Thus,

$$\Big\| \big[\overline{X}^{(N)}, \overline{Y}^{(N)}\big] \Big\| \leq \frac{2}{N}\, \|x\|\, \|y\| .$$

Therefore, commutators of mean-field observables vanish in norm when $N \to +\infty$
so that they can only provide a commutative, hence classical, description of the
many-body system. Furthermore, for a clustering state $\omega$ one finds

$$\lim_{N \to \infty} \omega\big(A^\dagger\, \overline{X}^{(N)} B\big) = \omega(A^\dagger B)\, \omega(x) , \qquad A, B \in \mathcal{A} . \qquad (7.188)$$

Indeed, for any integer $N_0 < N$ one can write:

$$\lim_{N \to \infty} \omega\big(A^\dagger\, \overline{X}^{(N)} B\big) = \lim_{N \to \infty} \frac{1}{N} \bigg[ \sum_{k=1}^{N_0} \omega\big(A^\dagger x^{[k]} B\big) + \sum_{k=N_0+1}^{N} \omega\big(A^\dagger x^{[k]} B\big) \bigg] .$$

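Since local operators are norm dense in $\mathcal{A}$, without loss of generality one can assume
$B$ to involve particles with labels $\leq N_0$; the $N_0$ terms with $k \leq N_0$ then contribute at
most $N_0 \|x\| \|A\| \|B\| / N$, which vanishes as $N \to \infty$, while for the terms with $k > N_0$ the
clustering property (7.186) yields (7.188), so that $\overline{X}^{(N)}$ tends weakly to a multiple of the
identity. With similar manipulations, one can also prove that the product $\overline{X}^{(N)} \overline{Y}^{(N)}$ of two
mean-field observables weakly converges to $\omega(x)\omega(y)$ [30]:

$$w\text{-}\lim_{N \to \infty} \overline{X}^{(N)}\, \overline{Y}^{(N)} = \omega(x)\, \omega(y)\, \mathbf{1} .$$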


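Furthermore, under the stronger $L_1$-clustering condition (see next Section and [363]),

$$\sum_{k \in \mathbb{N}} \Big| \omega\big(x^{[1]} y^{[k]}\big) - \omega(x)\, \omega(y) \Big| < \infty , \qquad (7.189)$$

the following scaling can be proven [30]:

$$\Big| \omega\big(\overline{X}^{(N)}\, \overline{Y}^{(N)}\big) - \omega(x)\, \omega(y) \Big| = O\Big(\frac{1}{N}\Big) .$$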


It thus follows that the weak limit of mean-field observables gives rise to a commutative
(von Neumann) algebra. Therefore, mean-field observables describe macroscopic,
classical degrees of freedom, despite their quantum microscopic origin. Since we are
interested in collective, mesoscopic observables that still retain a quantum character, a
less rapid scaling than $1/N$ is needed.
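
The $1/N$ decay of mean-field commutators can be checked on a small example. The Python sketch below builds, for a few spins 1/2, the mean-field operators associated with $x = s_1$ and $y = s_2$ and evaluates the norm of their commutator; the choice of these two observables, the chain lengths and all function names are illustrative assumptions. For this particular choice the commutator equals $(i/N^2)\sum_k s_3^{[k]}$, which is diagonal in the computational basis, so its operator norm can be read off the diagonal.

```python
def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def embed(op, site, n_sites):
    """Single-spin operator `op` acting at `site` (0-based), identity elsewhere."""
    identity = [[1, 0], [0, 1]]
    out = [[1]]
    for s in range(n_sites):
        out = kron(out, op if s == site else identity)
    return out


def mean_field(op, n_sites):
    """(1/N) * sum over sites of the embedded single-spin operator."""
    dim = 2 ** n_sites
    total = [[0j] * dim for _ in range(dim)]
    for site in range(n_sites):
        e = embed(op, site, n_sites)
        for i in range(dim):
            for j in range(dim):
                total[i][j] += e[i][j] / n_sites
    return total


S1 = [[0, 0.5], [0.5, 0]]
S2 = [[0, -0.5j], [0.5j, 0]]


def commutator_norm(n_sites):
    """Operator norm of the commutator of the mean-field operators of s_1 and s_2."""
    x, y = mean_field(S1, n_sites), mean_field(S2, n_sites)
    xy, yx = matmul(x, y), matmul(y, x)
    dim = 2 ** n_sites
    comm = [[xy[i][j] - yx[i][j] for j in range(dim)] for i in range(dim)]
    # The commutator is diagonal here, so the norm is the largest diagonal modulus.
    off_diagonal = max(abs(comm[i][j]) for i in range(dim) for j in range(dim) if i != j)
    assert off_diagonal < 1e-12
    return max(abs(comm[i][i]) for i in range(dim))


for n in (1, 2, 3, 4):
    bound = 2 * 0.5 * 0.5 / n  # 2 ||x|| ||y|| / N with ||s_1|| = ||s_2|| = 1/2
    print(n, commutator_norm(n), bound)
```

In this example the bound $2\|x\|\|y\|/N$ is saturated, which shows that the $1/N$ rate cannot be improved in general.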


7.7.2 Quantum Fluctuations: Mesoscopic Limit 


Quantum fluctuation operators are collective observables that emerge from sums
of single particle deviations from the average scaled by the inverse square root of
$N$. Given any single-particle operator $x$ and a reference state $\omega$, its corresponding
fluctuation operator $F^{(N)}(x)$ is

$$F^{(N)}(x) := \frac{1}{\sqrt{N}} \sum_{k=1}^{N} \big( x^{[k]} - \omega(x) \big) , \qquad (7.190)$$

in analogy with the fluctuations of classical random variables [137].

Although the scaling $1/\sqrt{N}$ does not in general guarantee convergence in the
weak operator topology, one can make sense of the large $N$ limit of (7.190) with
reference to some state-induced topology. Indeed, note that the mean value of the
fluctuation always vanishes: $\omega(F^{(N)}(x)) = 0$. Moreover, one has:

$$\lim_{N \to \infty} \omega\Big( \big(F^{(N)}(x)\big)^2 \Big) = \lim_{N \to \infty} \frac{1}{N} \sum_{j,k=1}^{N} \Big( \omega\big(x^{[j]} x^{[k]}\big) - \omega(x)^2 \Big) \leq \sum_{k \in \mathbb{N}} \Big| \omega\big(x^{[1]} x^{[k]}\big) - \omega(x)^2 \Big| , \qquad (7.191)$$


so that for states satisfying the $L_1$-clustering condition (7.189), the variance of the
fluctuations is bounded in the limit of large $N$. In addition, fluctuation operators
retain a quantum behaviour in the large $N$ limit. Indeed, consider two single-particle
operators $x, y \in a$ and call $z \in a$ their commutator. Since $[x^{[j]}, y^{[k]}] = \delta_{jk}\, z^{[j]}$,
exactly as in the proof of (7.188), for a clustering state $\omega$ one has:

$$\lim_{N \to \infty} \omega\big( A^\dagger \big[ F^{(N)}(x), F^{(N)}(y) \big] B \big) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \omega\big( A^\dagger z^{[k]} B \big) = \omega(A^\dagger B)\, \omega(z) \qquad \forall A, B \in \mathcal{A} .$$

Thus, commutators of quantum fluctuations of local operators give rise to mean-field
observables; as such, they behave for large $N$ as scalar multiples of the identity,
$\omega(z)\, \mathbf{1}$, providing canonical commutation relations like those in (5.70).

Thus, at the mesoscopic level, a non-commutative Bosonic algebraic structure
may naturally emerge, known as the quantum fluctuation algebra.

In order to explicitly construct this algebra, let us fix a set of linearly independent,
self-adjoint elements $\{x_1, x_2, \ldots, x_n\}$ in the single-particle algebra $a$ and consider
their real linear span:

$$\mathcal{X} = \Big\{ x_r \ \Big|\ x_r = r \cdot x = \sum_{\mu=1}^{n} r_\mu x_\mu , \ r \in \mathbb{R}^n \Big\} . \qquad (7.192)$$

Then, construct the fluctuation operators $F^{(N)}(x_\mu)$ as in (7.190) corresponding
to $x_\mu$, $\mu = 1, 2, \ldots, n$, and their linear span:

$$F^{(N)}(x_r) = \sum_{\mu=1}^{n} r_\mu\, F^{(N)}(x_\mu) = r \cdot F^{(N)}(x) . \qquad (7.193)$$


Given a translation invariant state $\omega$ satisfying the clustering property (7.186), in
order to build well behaved fluctuations, the discussion leading to (7.191) suggests
choosing observables $x_\mu$ for which the $L_1$-clustering property (7.189) is satisfied
for all elements of the space $\mathcal{X}$. This condition guarantees that the $n \times n$ correlation
matrix $C^{(\omega)}$ (see (5.119) in Sect. 5.5), with components

$$C^{(\omega)}_{\mu\nu} = \lim_{N \to \infty} \omega\big( F^{(N)}(x_\mu)\, F^{(N)}(x_\nu) \big) , \qquad \mu, \nu = 1, 2, \ldots, n , \qquad (7.194)$$

is well defined [363]. This matrix can be decomposed as

$$C^{(\omega)} = \Sigma^{(\omega)} + \frac{i}{2}\, \sigma^{(\omega)} , \qquad (7.195)$$

in terms of the covariance matrix (see the discussion of Gaussian states in Sect. 5.5),
namely its real, symmetric part $\Sigma^{(\omega)}$, with components given by anticommutators

$$\Sigma^{(\omega)}_{\mu\nu} = \frac{1}{2} \lim_{N \to \infty} \omega\big( \big\{ F^{(N)}(x_\mu), F^{(N)}(x_\nu) \big\} \big) , \qquad (7.196)$$

and of the symplectic matrix, namely its imaginary, antisymmetric part $\sigma^{(\omega)}$, with
components given by commutators,

$$\sigma^{(\omega)}_{\mu\nu} = -i \lim_{N \to \infty} \omega\big( \big[ F^{(N)}(x_\mu), F^{(N)}(x_\nu) \big] \big) . \qquad (7.197)$$

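Indeed, the real $n$-dimensional space $\mathcal{X}$, once endowed with the antisymmetric matrix
$\sigma^{(\omega)}$, assumed for simplicity to be non-degenerate, becomes a symplectic space. As
such, it supports a Bosonic algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, defined as the complex vector space
generated by the linear span of operators $W(r)$, with $r \in \mathbb{R}^n$, obeying the Weyl-like
algebraic relations (5.80):

$$W(r_1)\, W(r_2) = W(r_1 + r_2)\, e^{-\frac{i}{2}\, r_1 \cdot \sigma^{(\omega)} \cdot r_2} , \qquad r_1, r_2 \in \mathbb{R}^n , \qquad (7.198)$$

$$\big[ W(r) \big]^\dagger = W(-r) = \big[ W(r) \big]^{-1} , \qquad W(0) = \mathbf{1} . \qquad (7.199)$$

These operators generate a Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$. Quasi-free states on it (see
Sect. 7.4.1, Definition 7.4.3) are particularly tractable:

$$\Omega_\Sigma\big( W(r) \big) = e^{-\frac{1}{2}\, r \cdot \Sigma \cdot r} , \qquad r \in \mathbb{R}^n , \qquad (7.200)$$

where the covariance matrix $\Sigma$ is positive, symmetric and, together with the symplectic
matrix, satisfies

$$\Sigma + \frac{i}{2}\, \sigma^{(\omega)} \geq 0 , \qquad (7.201)$$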


thus assuring the positivity of $\Omega_\Sigma$. Quasi-free states thus admit a representation in
terms of Bose fields [363].
In the GNS representation $\pi_{\Omega_\Sigma}$ based on a quasi-free state $\Omega_\Sigma$ (see Sect. 7.4.2), the
Weyl operators can be expressed as

$$\pi_{\Omega_\Sigma}\big[ W(r) \big] = e^{i\, r \cdot F} , \qquad (7.202)$$

in terms of $n$ (unbounded) Bose operators $F_\mu$, $\mu = 1, 2, \ldots, n$. They provide an
explicit expression for the associated covariance matrix by means of their anticommutators:

$$\Sigma_{\mu\nu} = \frac{1}{2}\, \Omega_\Sigma\big( \{ F_\mu, F_\nu \} \big) , \qquad (7.203)$$

while, thanks to the algebraic relation (7.198), their commutators give the symplectic
matrix:

$$\sigma^{(\omega)}_{\mu\nu} = -i\, [ F_\mu, F_\nu ] . \qquad (7.204)$$

The analogy of the relations (7.203) and (7.204) with the results (7.196) and (7.197)
suggests considering elements in the quasi-local algebra $\mathcal{A}$ obtained by exponentiating
the fluctuations $F^{(N)}(x_r)$ in (7.193),

$$W^{(N)}(r) = e^{i\, r \cdot F^{(N)}(x)} , \qquad (7.205)$$

and focusing on states $\omega$ for which the expectation $\omega(W^{(N)}(r))$ becomes Gaussian in
the large $N$ limit. The operators $W^{(N)}(r)$ will be called Weyl-like operators as they
behave as true Weyl operators only in the large-$N$ limit. Indeed, using the
Baker-Campbell-Hausdorff formula,


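$$W^{(N)}(r_1)\, W^{(N)}(r_2) = \exp\Big\{ i F^{(N)}(x_{r_1 + r_2}) - \frac{1}{2} \big[ F^{(N)}(x_{r_1}), F^{(N)}(x_{r_2}) \big]$$

$$\qquad - \frac{i}{12} \Big( \big[ F^{(N)}(x_{r_1}), \big[ F^{(N)}(x_{r_1}), F^{(N)}(x_{r_2}) \big] \big] - \big[ F^{(N)}(x_{r_2}), \big[ F^{(N)}(x_{r_1}), F^{(N)}(x_{r_2}) \big] \big] \Big) + \cdots \Big\} .$$

As seen before, in the large $N$ limit, the first commutator on the right hand side tends
to a multiple of the identity, while all the additional terms vanish in norm; for instance, one has

$$\lim_{N \to \infty} \Big\| \big[ F^{(N)}(x_{r_1}), \big[ F^{(N)}(x_{r_1}), F^{(N)}(x_{r_2}) \big] \big] \Big\| = \lim_{N \to \infty} \frac{1}{N^{3/2}} \Big\| \sum_{k=1}^{N} \big[ x_{r_1}^{[k]}, \big[ x_{r_1}^{[k]}, x_{r_2}^{[k]} \big] \big] \Big\| \leq \lim_{N \to \infty} \frac{4}{\sqrt{N}}\, \| x_{r_1} \|^2\, \| x_{r_2} \| = 0 .$$

Therefore, in the large-$N$ limit the Weyl-like operators obey the following algebraic
relations:

$$W^{(N)}(r_1)\, W^{(N)}(r_2) \simeq W^{(N)}(r_1 + r_2)\, e^{-\frac{1}{2} \big[ F^{(N)}(x_{r_1}),\, F^{(N)}(x_{r_2}) \big]} . \qquad (7.206)$$

Recalling (7.197), they reduce to the Weyl relations (7.198). More precisely, one has
the following result [363].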

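Theorem 7.7.1 Given the quasi-local algebra $\mathcal{A}$, the real linear vector space $\mathcal{X}$ as
in (7.192), and a clustering state $\omega$ on $\mathcal{A}$, satisfying the conditions:

$$(1) \quad \sum_{k \in \mathbb{N}} \Big| \omega\big( x_{r_1}^{[1]}\, x_{r_2}^{[k]} \big) - \omega(x_{r_1})\, \omega(x_{r_2}) \Big| < \infty , \qquad r_1, r_2 \in \mathbb{R}^n ,$$

$$(2) \quad \lim_{N \to \infty} \omega\big( e^{i\, r \cdot F^{(N)}(x)} \big) = e^{-\frac{1}{2}\, r \cdot \Sigma^{(\omega)} \cdot r} , \qquad r \in \mathbb{R}^n ,$$

one can define a Gaussian state $\Omega$ on the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$ such that, for
all $r_i \in \mathbb{R}^n$, $i = 1, 2, \ldots, m$,

$$\lim_{N \to \infty} \omega\big( W^{(N)}(r_1)\, W^{(N)}(r_2) \cdots W^{(N)}(r_m) \big) = \Omega\big( W(r_1)\, W(r_2) \cdots W(r_m) \big) ,$$

with

$$\lim_{N \to \infty} \omega\big( W^{(N)}(r) \big) = e^{-\frac{1}{2}\, r \cdot \Sigma^{(\omega)} \cdot r} = \Omega\big( W(r) \big) , \qquad r \in \mathbb{R}^n . \qquad (7.207)$$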
N>oo 


The Gaussian state $\Omega$ on the algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, with covariance matrix $\Sigma^{(\omega)}$, is
well defined. First of all, it is normalized, as easily seen by setting $r = 0$ in (7.207).
Further, its positivity is guaranteed by the positivity of the correlation matrix (7.194):

$$C^{(\omega)} = \Sigma^{(\omega)} + \frac{i}{2}\, \sigma^{(\omega)} \geq 0 . \qquad (7.208)$$

Being Gaussian, the state $\Omega$ gives rise to a regular representation of the Weyl algebra
$\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$ [363], so that one can introduce the Bose fields $F_\mu$ as in (7.202) and,
through (7.205) and (7.207), i.e. $\lim_{N \to \infty} \omega(e^{i r \cdot F^{(N)}(x)}) = \Omega(e^{i r \cdot F})$, identify the large $N$ limit
of local fluctuation operators with those Bose fields:

$$\lim_{N \to \infty} F^{(N)}(x_\mu) = F_\mu . \qquad (7.209)$$


Despite being collective operators, these operators retain a quantum, non-commutative
character. They describe the behaviour of many-body systems at a level that is half
way between the microscopic realm of single-particle observables and the macroscopic
one of mean-field operators, as discussed earlier. In this respect, the large
$N$ limit that allows one to pass from the local fluctuations in (7.190) to the mesoscopic
operators belonging to the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, as described by the previous
Theorem, can be called the mesoscopic limit. It can be given the following operational
definition.

Definition 7.7.1 (Mesoscopic Limit) Given a sequence of operators $O^{(N)}$ in the
quasi-local algebra $\mathcal{A}$, we shall say that they tend to the mesoscopic limit $O$,
symbolically $m\text{-}\lim_{N \to \infty} O^{(N)} = O$, if and only if, $\forall r_1, r_2 \in \mathbb{R}^n$,

$$\lim_{N \to \infty} \omega\big( W^{(N)}(r_1)\, O^{(N)}\, W^{(N)}(r_2) \big) = \Omega\big( W(r_1)\, O\, W(r_2) \big) .$$


Remark 7.7.1 The right hand side in the previous limit corresponds to the matrix
elements of the operator $\pi_\Omega(O)$ with respect to the two vectors $\pi_\Omega(W(r_1))|\Omega\rangle$,
$\pi_\Omega(W(r_2))|\Omega\rangle$ in the GNS-representation of the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$ based
on the state $\Omega$. Since these vectors are dense in the corresponding Hilbert space,
those matrix elements completely define the operator $O$.


A similar approach can be pursued in order to extract a mesoscopic fluctuation
dynamics from the microscopic one. Namely, given a one-parameter family
of microscopic dynamical maps $\Phi_t^{(N)}$ on the quasi-local algebra $\mathcal{A}$, one seeks the
limit, $\Phi_t$, of its action on the Weyl-like operators $W^{(N)}(r)$ as $N$ becomes large.


Definition 7.7.2 (Mesoscopic dynamics)

Given a sequence of one-parameter maps $\Phi_t^{(N)} : \mathcal{A} \to \mathcal{A}$, it tends to the mesoscopic
limit $\Phi_t$ on the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, $m\text{-}\lim_{N \to \infty} \Phi_t^{(N)} = \Phi_t$, if and only if,
for all $r, r_1, r_2 \in \mathbb{R}^n$,

$$\lim_{N \to \infty} \omega\big( W^{(N)}(r_1)\, \Phi_t^{(N)}\big[ W^{(N)}(r) \big]\, W^{(N)}(r_2) \big) = \Omega\big( W(r_1)\, \Phi_t\big[ W(r) \big]\, W(r_2) \big) .$$


Example 7.7.1 As a paradigmatic instance of mesoscopic limit, consider a chain
of quantum spins 1/2 (see Sect. 7.4.5). At a microscopic level, each single spin is
described by the operators $s_1$, $s_2$ and $s_3$, obeying the commutation relations
$[s_j, s_k] = i\, \epsilon_{jk\ell}\, s_\ell$, $j, k, \ell = 1, 2, 3$. Together with the identity operator $s_0 = 1/2$, they generate
the single-spin algebra $a = M_2(\mathbb{C})$. The inductive limit of the tensor products of
single-site algebras from site $p$ to site $q$, $\mathcal{A}_{[p,q]}$, yields the quasi-local algebra $\mathcal{A}$ of
the chain. Let it be equipped with a thermal state $\omega_\beta$, at temperature $1/\beta$, constructed
from the tensor product of single-site thermal states (see Example 5.6.2.1):

$$\omega_\beta = \bigotimes_k \omega_\beta^{[k]} . \qquad (7.210)$$

At site $k$, the state $\omega_\beta^{[k]}$ is determined by the expectations

$$\omega_\beta^{[k]}\big( s_1^{[k]} \big) = \omega_\beta^{[k]}\big( s_2^{[k]} \big) = 0 , \qquad \omega_\beta^{[k]}\big( s_3^{[k]} \big) = -\frac{\eta}{2} , \qquad \eta := \tanh\Big( \frac{\beta \varepsilon}{2} \Big) .$$

It can be represented by a Gibbs density matrix $\rho_\beta^{[k]}$ constructed with the site-$k$
Hamiltonian $h^{[k]} = \varepsilon\, s_3^{[k]}$, so that for any operator $x^{[k]} \in a^{[k]}$:

$$\omega_\beta^{[k]}\big( x^{[k]} \big) = \mathrm{Tr}\big[ \rho_\beta^{[k]}\, x^{[k]} \big] , \qquad \rho_\beta^{[k]} = \frac{e^{-\beta h^{[k]}}}{2 \cosh(\varepsilon \beta / 2)} . \qquad (7.211)$$


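For a chain containing a finite number $N$ of sites, the state $\omega_\beta$ in (7.210) can similarly
be represented by a density matrix as

$$\rho_\beta^{(N)} = \frac{e^{-\beta \sum_{k=1}^{N} h^{[k]}}}{\mathrm{Tr}\big[ e^{-\beta \sum_{k=1}^{N} h^{[k]}} \big]} .$$

Given the single-site spin operators $s_i$ and the state $\omega_\beta$, one can now construct the
corresponding fluctuations as in (7.190):

$$F^{(N)}(s_i) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} \big( s_i^{[k]} - \omega_\beta(s_i) \big) , \qquad i = 1, 2, 3 . \qquad (7.212)$$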


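From them, the symplectic matrix $\sigma^{(\beta)}$ in (7.197) can be easily computed; indeed,
taking into account the tensor product structure of the state $\omega_\beta$, it reduces to the
expectation of the commutator of single-site operators,

$$\sigma^{(\beta)}_{jk} = -i\, \omega_\beta\big( [s_j, s_k] \big) = \sum_{\ell=1}^{3} \epsilon_{jk\ell}\, \omega_\beta(s_\ell) ,$$

so that, explicitly,

$$\sigma^{(\beta)} = -\frac{\eta}{2} \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} .$$

Recalling (7.204), this matrix reproduces the commutators of the Bose operators $F_i$
obtained as mesoscopic limit of the three fluctuations (7.212); as a result, $F_3$ commutes
with all remaining operators and therefore it represents a classical, collective
degree of freedom. On the contrary, the two suitably rescaled operators $\hat{P} = \sqrt{2/\eta}\, F_1$
and $\hat{X} = \sqrt{2/\eta}\, F_2$ obey standard canonical commutation relations, $[\hat{X}, \hat{P}] = i$, from which
standard Weyl operators $W(r) = e^{i(r_1 \hat{P} + r_2 \hat{X})}$ can be defined. The corresponding
Weyl algebra $\mathcal{W}(\sigma)$ is equipped with a quasi-free state $\Omega_\beta$,

$$\lim_{N \to \infty} \omega_\beta\Big( e^{i \sqrt{2/\eta}\, \big( r_1 F^{(N)}(s_1) + r_2 F^{(N)}(s_2) \big)} \Big) = e^{-\frac{r_1^2 + r_2^2}{4} \coth(\beta \varepsilon / 2)} = \Omega_\beta\big( W(r) \big) ,$$

which is again a thermal state: it can be represented by a standard Gibbs density
matrix:

$$\Omega_\beta\big( W(r) \big) = \frac{\mathrm{Tr}\big[ e^{-\beta H}\, W(r) \big]}{\mathrm{Tr}\big[ e^{-\beta H} \big]} , \qquad H = \frac{\varepsilon}{2} \big( \hat{X}^2 + \hat{P}^2 \big) .$$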
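
For a product state such as the thermal state of this example, the limits defining the correlation matrix are already reached at finite $N$: cross-site terms cancel and one is left with the single-site quantity $\omega(x_j x_k) - \omega(x_j)\omega(x_k)$. The following Python sketch uses this fact to compute covariance and symplectic matrices for the spin-1/2 chain. It assumes the Gibbs state $e^{-\beta \varepsilon s_3}/(2\cosh(\beta\varepsilon/2))$ and the conventions $\Sigma = (C + C^T)/2$, $\sigma = -i(C - C^T)$; the numerical values of $\beta$ and $\varepsilon$ and all function names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


# Spin-1/2 operators s_j = (Pauli matrix)/2.
SPIN = {
    1: [[0, 0.5], [0.5, 0]],
    2: [[0, -0.5j], [0.5j, 0]],
    3: [[0.5, 0], [0, -0.5]],
}


def gibbs_state(beta, eps):
    """Single-site Gibbs state for h = eps * s_3 (diagonal in the s_3 basis)."""
    z = 2.0 * math.cosh(beta * eps / 2.0)
    return [[math.exp(-beta * eps / 2.0) / z, 0.0],
            [0.0, math.exp(beta * eps / 2.0) / z]]


def expect(rho, op):
    return trace(matmul(rho, op))


def covariance_and_symplectic(beta, eps):
    """Return (Sigma, sigma) from C_jk = w(s_j s_k) - w(s_j) w(s_k)."""
    rho = gibbs_state(beta, eps)
    mean = {j: expect(rho, SPIN[j]) for j in SPIN}
    c = [[expect(rho, matmul(SPIN[j], SPIN[k])) - mean[j] * mean[k]
          for k in (1, 2, 3)] for j in (1, 2, 3)]
    cov = [[((c[j][k] + c[k][j]) / 2).real for k in range(3)] for j in range(3)]
    sym = [[(-1j * (c[j][k] - c[k][j])).real for k in range(3)] for j in range(3)]
    return cov, sym


beta, eps = 2.0, 1.0  # arbitrary illustrative values
eta = math.tanh(beta * eps / 2.0)
cov, sym = covariance_and_symplectic(beta, eps)
print("sigma_12 =", sym[0][1], " expected -eta/2 =", -eta / 2.0)
print("variance of sqrt(2/eta) F_1:", 2.0 / eta * cov[0][0])
```

With these conventions the only non-vanishing symplectic entries are $\sigma_{12} = -\sigma_{21} = -\eta/2$, and the rescaled variance $(2/\eta)\,\Sigma_{11}$ equals $\coth(\beta\varepsilon/2)/2$, the value expected for a thermal oscillator.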


7.7.3 Mesoscopic Dissipative Dynamics 


We shall now study what kind of dynamics emerges at the mesoscopic level starting 
from a given microscopic dissipative time-evolution (see Sect. 5.6.3) for the elemen- 
tary constituents of the many-body system. Indeed, in actual experimental conditions, 
many-body systems can hardly be considered isolated from their surroundings and 
need to be treated as open quantum systems. 

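Within the setting of quantum fluctuation theory as outlined in the previous sections,
we shall start with a system composed of $N$ particles described by the local
algebra $\mathcal{A}^{(N)} \subset \mathcal{A}$ whose microscopic, open dynamics is generated by master equations
of the GKSL-form in (5.213):

$$\partial_t X_t = \mathbb{L}^{(N)}[X_t] , \qquad \mathbb{L}^{(N)}[X] = \mathbb{H}^{(N)}[X] + \mathbb{D}^{(N)}[X] , \qquad (7.213)$$

where $X \in \mathcal{A}^{(N)}$ and the first contribution,

$$\mathbb{H}^{(N)}[X] = i\, \big[ H^{(N)}, X \big] , \qquad (7.214)$$

is the purely Hamiltonian one, whose generator $H^{(N)}$ can be taken to be the sum of
single-particle Hamiltonians $h^{[k]} = (h^{[k]})^\dagger$:

$$H^{(N)} = \sum_{k=1}^{N} h^{[k]} , \qquad H^{(N)} = \big( H^{(N)} \big)^\dagger , \qquad (7.215)$$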


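while the term $\mathbb{D}^{(N)}$ takes into account dissipation and noise:

$$\mathbb{D}^{(N)}[X] = \sum_{k,\ell=1}^{N} J_{k\ell} \sum_{\alpha,\beta=1}^{m} D_{\alpha\beta} \Big( v_\alpha^{[k]}\, X\, \big( v_\beta^{[\ell]} \big)^\dagger - \frac{1}{2} \Big\{ v_\alpha^{[k]} \big( v_\beta^{[\ell]} \big)^\dagger , X \Big\} \Big)$$

$$\qquad = \sum_{k,\ell=1}^{N} \frac{J_{k\ell}}{2} \sum_{\alpha,\beta=1}^{m} D_{\alpha\beta} \Big( v_\alpha^{[k]} \big[ X, \big( v_\beta^{[\ell]} \big)^\dagger \big] + \big[ v_\alpha^{[k]}, X \big] \big( v_\beta^{[\ell]} \big)^\dagger \Big) , \qquad (7.216)$$

with $v_\alpha^{[k]}$ single-particle operators. While the Hamiltonian contribution does not
contain any interaction among the $N$ particles, in the purely dissipative term $\mathbb{D}^{(N)}$ the
mixing action of the operators $v_\alpha^{[k]}$ is weighted by the coefficients $J_{k\ell} D_{\alpha\beta}$, involving
in general different particles. Altogether, they form the Kossakowski matrix $J \otimes D$;
in order to ensure the complete positivity (see Sect. 5.2.2) of the generated dynamical
maps $\Phi_t^{(N)} = e^{t \mathbb{L}^{(N)}}$, both $J$ and $D$ must be positive semi-definite.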
In order to deal with an analytically tractable model, we enforce translation invariance
by assigning the same Hamiltonian to each site, $h^{[k]} = h$, and we further consider
different particle couplings $J_{k\ell}$ of the form

$$J_{k\ell} = J(|k - \ell|) , \qquad J_{kk} = J(0) > 0 . \qquad (7.217)$$

Also, assume that mixing effects due to the presence of the environment are short-range
by imposing a fast decay of the couplings of far separated sites along the chain:

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k,\ell=1}^{N} |J_{k\ell}| < \infty . \qquad (7.218)$$
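
The two requirements on the couplings, positivity of $J$ and the short-range condition (7.218), can be illustrated with a concrete and purely hypothetical choice, $J_{k\ell} = q^{|k-\ell|}$ with $0 < q < 1$. The Python sketch below checks positivity through a Cholesky factorization, which tests strict positive definiteness and is therefore sufficient for positive semi-definiteness, and evaluates the normalized sum $\frac{1}{N}\sum_{k,\ell}|J_{k\ell}|$ for growing $N$. The exponential form, the value of $q$ and the function names are assumptions made only for this example.

```python
import math


def coupling_matrix(n, q):
    """Hypothetical translation-invariant couplings J_kl = q**|k - l|."""
    return [[q ** abs(k - l) for l in range(n)] for k in range(n)]


def normalized_sum(n, q):
    """(1/N) * sum over k, l of |J_kl|, the quantity appearing in (7.218)."""
    j = coupling_matrix(n, q)
    return sum(abs(v) for row in j for v in row) / n


def is_positive_definite(mat):
    """Cholesky test for a real symmetric matrix: True iff all pivots are > 0."""
    n = len(mat)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


q = 0.5
for n in (5, 50, 200):
    print(n, normalized_sum(n, q))
print("limit (1 + q) / (1 - q) =", (1 + q) / (1 - q))
print("J positive definite:", is_positive_definite(coupling_matrix(6, q)))
```

For this family the normalized sum increases with $N$ towards $(1+q)/(1-q)$, so that (7.218) holds. By contrast, nearest-neighbour couplings as strong as the diagonal ones, such as the $3 \times 3$ matrix with $J(0) = J(1) = 1$ and $J(2) = 0$, fail the positivity test.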


Finally, we shall further require the time-invariance of the reference microscopic
state $\omega$,

$$\omega\big( \Phi_t^{(N)}[X] \big) = \omega(X) \quad \Longleftrightarrow \quad \omega\big( \mathbb{L}^{(N)}[X] \big) = 0 . \qquad (7.219)$$

We shall now study the large $N$ limit of the dynamics generated by (7.213) when acting
on the elements of the fluctuation algebra as introduced in the previous section.
We shall see that, when, through $\mathbb{D}^{(N)}$, the effects induced by the environment are
taken into account, then the mesoscopic dynamics emerging from the local time-evolution
$\Phi_t^{(N)} = e^{t \mathbb{L}^{(N)}}$, $t \geq 0$, generated by (7.213)-(7.216), is a non-trivial
dissipative semigroup $\Phi_t$ of completely positive maps on the algebra of fluctuations.

Remark 7.7.2 The fluctuation algebra is constructed by means of the linear span $\mathcal{X}$
(see (7.192)) of a selection of $n$ physically relevant single-particle hermitian operators
$x_\mu$, $\mu = 1, 2, \ldots, n$; out of them, one builds the fluctuations $F^{(N)}(x_r) = r \cdot F^{(N)}(x)$
and Weyl-like operators $W^{(N)}(r) = e^{i\, r \cdot F^{(N)}(x)}$. In general, there is no guarantee that
the action of the generator $\mathbb{L}^{(N)}$ on $F^{(N)}(x_r) = \sum_{\mu=1}^{n} r_\mu F^{(N)}(x_\mu)$ would give the
fluctuation of a single-particle operator still belonging to $\mathcal{X}$. In order to recover, out of the action
of $\mathbb{L}^{(N)}$, a mesoscopic dynamics for the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, namely for the
large $N$ limit of the algebra generated by $W^{(N)}(r)$, one has to assume the linear span
$\mathcal{X}$ to be mapped into itself by the generator $\mathbb{L}^{(N)}$:

$$\mathbb{L}^{(N)}\big[ x_\mu^{[k]} \big] = \mathbb{H}^{(N)}\big[ x_\mu^{[k]} \big] + \mathbb{D}^{(N)}\big[ x_\mu^{[k]} \big] = \sum_{\nu=1}^{n} \mathcal{L}_{\mu\nu}\, x_\nu^{[k]} , \qquad \mathcal{L} = \mathcal{H} + \mathcal{D} , \qquad (7.220)$$

where $\mathcal{H}$ and $\mathcal{D}$ are $n \times n$ coefficient matrices specifying the action of the Hamiltonian
$\mathbb{H}^{(N)}$ and dissipative $\mathbb{D}^{(N)}$ contributions on $x_\mu^{[k]}$. Given the microscopic dynamics,
such an assumption is not too restrictive: in general, it can be satisfied by suitably
enlarging the set $\mathcal{X}$ of physically relevant single-particle operators.


With the assumption (7.220), one can show that the mesoscopic dynamics emerging
from the large $N$ limit of the time evolution $\Phi_t^{(N)}$, as specified by Definition 7.7.2,
is again a dissipative semigroup of maps $\Phi_t$ on the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, transforming
Weyl operators into Weyl operators. Maps of this kind are called quasi-free
and their generic form is as follows [120,280]:

$$\Phi_t\big[ W(r) \big] = e^{f_t(r)}\, W(r_t) , \qquad (7.221)$$

with given time-dependent prefactor and parameters $r_t$. In the present case, one finds:

$$r_t = M_t^T \cdot r , \qquad M_t = e^{t \mathcal{L}} , \qquad (7.222)$$

where $\mathcal{L}$ is the $n \times n$ matrix introduced in (7.220), while $T$ represents matrix transposition.
Instead, the exponent of the prefactor can be cast in the following form:

$$f_t(r) = -\frac{1}{2}\, r \cdot K_t \cdot r , \qquad K_t = \Sigma^{(\omega)} - M_t\, \Sigma^{(\omega)} M_t^T , \qquad (7.223)$$

where $\Sigma^{(\omega)}$ is the covariance matrix defined in (7.196). With these definitions, one
can state the following result (see [29,30] for a proof).


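Theorem 7.7.2 Given the invariant state $\omega$ on the quasi-local algebra $\mathcal{A}$, the real
linear vector space $\mathcal{X}$ generated by the single-particle operators $x_\mu \in a$ and the
corresponding Weyl-like operators $W^{(N)}(r) = e^{i\, r \cdot F^{(N)}(x)}$, evolving in time with the
semigroup of maps $\Phi_t^{(N)} = e^{t \mathbb{L}^{(N)}}$ generated by $\mathbb{L}^{(N)}$ in (7.213)-(7.216) and leaving
$\mathcal{X}$ invariant, the mesoscopic limit

$$m\text{-}\lim_{N \to \infty} \Phi_t^{(N)}\big[ W^{(N)}(r) \big] = \Phi_t\big[ W(r) \big] ,$$

defines a Gaussian quantum dynamical semigroup $\{\Phi_t\}_{t \geq 0}$ on the Weyl algebra of
fluctuations $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, explicitly given by (7.221)-(7.223).

The mesoscopic evolution maps $\Phi_t$ are clearly unital, i.e. they map the identity
operator into itself, as it follows by letting $r = 0$ in (7.221). In addition, they compose
as a semigroup; indeed, for all $s, t \geq 0$,

$$\Phi_s \circ \Phi_t\big[ W(r) \big] = e^{-\frac{1}{2} \big( r \cdot K_t \cdot r + r_t \cdot K_s \cdot r_t \big)}\, W\big( (r_t)_s \big) = e^{-\frac{1}{2} \big( r \cdot K_t \cdot r + r \cdot ( M_t K_s M_t^T ) \cdot r \big)}\, W(r_{t+s}) = \Phi_{t+s}\big[ W(r) \big] ,$$

where the last equality follows from $K_t + M_t K_s M_t^T = K_{t+s}$.

Further, the maps $\Phi_t$ are completely positive, since one can easily check that the
following condition [120] is satisfied (see Appendix D in [31]):

$$\Sigma^{(\omega)} + \frac{i}{2}\, \sigma^{(\omega)} \geq M_t \Big( \Sigma^{(\omega)} + \frac{i}{2}\, \sigma^{(\omega)} \Big) M_t^T .$$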


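Thanks to the properties of unitality and complete positivity, the maps $\Phi_t$ obey
Schwarz-positivity (see Proposition 5.2.2):

$$\Phi_t\big[ X^\dagger X \big] \geq \Phi_t\big[ X^\dagger \big]\, \Phi_t\big[ X \big] .$$

Using this property and the unitarity of the Weyl operators $W(r)$, one further finds:

$$\big| e^{f_t(r)} \big| = \big\| \Phi_t\big[ W(r) \big] \big\| \leq \| W(r) \| = 1 .$$

As a direct consequence of the time-invariance of the microscopic state $\omega$ with
respect to the microscopic dissipative dynamics $\Phi_t^{(N)}$, this last result also follows
from the positivity of the matrix $K_t$ in (7.223) [30,31]. For the same reason, also
the mesoscopic Gaussian state $\Omega$ is left invariant by the mesoscopic dynamics $\Phi_t$;
indeed, recalling (7.207), one has:

$$\Omega\big( \Phi_t[ W(r) ] \big) = e^{f_t(r)}\, \Omega\big( W(r_t) \big) = e^{-\frac{1}{2} r \cdot K_t \cdot r - \frac{1}{2} r_t \cdot \Sigma^{(\omega)} \cdot r_t} = e^{-\frac{1}{2} r \cdot K_t \cdot r - \frac{1}{2} r \cdot ( M_t \Sigma^{(\omega)} M_t^T ) \cdot r} = e^{-\frac{1}{2} r \cdot \Sigma^{(\omega)} \cdot r} = \Omega\big( W(r) \big) .$$

More generally, given any state $\hat{\Omega}$ on the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$, one defines
its time-evolution under $\Phi_t$ according to the dual action: $\hat{\Omega} \mapsto \hat{\Omega}_t := \hat{\Omega} \circ \Phi_t$. For states
admitting a representation in terms of density matrices, one can then define a dual map
$\widetilde{\Phi}_t$ acting on any density matrix $\rho$ on $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$ by sending it into $\rho(t) = \widetilde{\Phi}_t[\rho]$
according to the duality relation

$$\mathrm{Tr}\big[ \widetilde{\Phi}_t[\rho]\, W(r) \big] = \mathrm{Tr}\big[ \rho\, \Phi_t[ W(r) ] \big] .$$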


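As already observed, a useful class of states on $\mathcal{W}(\mathcal{X}, \sigma^{(\omega)})$ are Gaussian states $\Omega_\Sigma$:

$$\Omega_\Sigma\big( W(r) \big) = \mathrm{Tr}\big[ \rho_\Sigma\, W(r) \big] = e^{-\frac{1}{2} ( r \cdot \Sigma \cdot r )} ,$$

with

$$[\Sigma]_{\mu\nu} = \frac{1}{2}\, \mathrm{Tr}\big[ \rho_\Sigma\, \{ F_\mu, F_\nu \} \big] , \qquad \mu, \nu = 1, \ldots, n ,$$

where the operators $\{F_\mu\}$ are the Bosonic operators introduced in (7.202), so
that $W(r) = e^{i\, r \cdot F}$. These states are completely identified by their covariance matrix
$\Sigma$ such that (5.119) holds, and $\Phi_t$ transforms Gaussian states into Gaussian states:

$$\mathrm{Tr}\big[ \widetilde{\Phi}_t[\rho_\Sigma]\, W(r) \big] = e^{f_t(r)}\, \mathrm{Tr}\big[ \rho_\Sigma\, W(r_t) \big] = e^{-\frac{1}{2} ( r \cdot \Sigma(t) \cdot r )} = \mathrm{Tr}\big[ \rho_{\Sigma(t)}\, W(r) \big] ,$$

with the time-dependent covariance matrix $\Sigma(t)$ explicitly given by:

$$\Sigma(t) = \Sigma^{(\omega)} - M_t\, \Sigma^{(\omega)} M_t^T + M_t\, \Sigma\, M_t^T . \qquad (7.224)$$

From these results, one recovers the time-invariance of the mesoscopic state $\Omega$,
since starting from the initial covariance $\Sigma = \Sigma^{(\omega)}$, the evolution (7.224) gives:
$\Sigma(t) = \Sigma^{(\omega)}$.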
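
The structure of the covariance evolution $\Sigma(t) = \Sigma^{(\omega)} - M_t \Sigma^{(\omega)} M_t^T + M_t \Sigma M_t^T$ can be explored numerically. The Python sketch below does so for a single mode with a purely hypothetical generator matrix, a damped rotation with rate `gamma` and frequency `freq`, for which $M_t$ is known in closed form. The matrices `SIGMA_OMEGA` and `SIGMA_0` and all parameter values are assumptions chosen for illustration and are not derived from any specific microscopic model.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def propagator(t, gamma=0.3, freq=1.0):
    """Hypothetical M_t = exp(t L) for L = [[-gamma, freq], [-freq, -gamma]]."""
    damping = math.exp(-gamma * t)
    c, s = math.cos(freq * t), math.sin(freq * t)
    return [[damping * c, damping * s], [-damping * s, damping * c]]


def evolve(sigma_0, sigma_omega, t):
    """Sigma(t) = Sigma_omega - M_t Sigma_omega M_t^T + M_t Sigma_0 M_t^T."""
    m = propagator(t)
    mt = transpose(m)
    relaxed = matmul(matmul(m, sigma_omega), mt)
    carried = matmul(matmul(m, sigma_0), mt)
    n = len(sigma_0)
    return [[sigma_omega[i][j] - relaxed[i][j] + carried[i][j] for j in range(n)]
            for i in range(n)]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


SIGMA_OMEGA = [[0.8, 0.0], [0.0, 0.8]]  # assumed stationary covariance
SIGMA_0 = [[2.0, 0.3], [0.3, 0.6]]      # assumed initial covariance

print("invariance defect:", max_diff(evolve(SIGMA_OMEGA, SIGMA_OMEGA, 1.7), SIGMA_OMEGA))
two_steps = evolve(evolve(SIGMA_0, SIGMA_OMEGA, 0.7), SIGMA_OMEGA, 0.5)
print("semigroup defect:", max_diff(two_steps, evolve(SIGMA_0, SIGMA_OMEGA, 1.2)))
print("distance from stationary covariance at t = 50:",
      max_diff(evolve(SIGMA_0, SIGMA_OMEGA, 50.0), SIGMA_OMEGA))
```

In this sketch the stationary covariance is a fixed point, two successive evolutions compose into a single one because $M_s M_t = M_{s+t}$, and any initial covariance relaxes towards the stationary one whenever $M_t$ decays.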


7.7.4 Mesoscopic Entanglement Through Dissipation 


As discussed at the end of Sect. 6.2, though the presence of an external environment 
typically leads to dissipation and decoherence, suitably engineered environments are 
capable of creating and enhancing quantum correlations among quantum systems 
immersed in them. Indeed, entanglement can be generated solely through the mixing 
structure of the irreversible dynamics, without any direct interaction between the 
quantum systems. We now show that the microscopic generation of entanglement 
can survive at the mesoscopic level providing a means to entangle collective degrees 
of freedom as quantum fluctuations. 

Let us consider two spin-1/2 chains, one next to the other, of the type already
discussed in Example 7.7.1, both immersed in a common thermal bath at temperature
$T = 1/\beta$. Site $k$ of the double chain system will thus consist of the two sites $k$ in
the two chains and will support the two-spin algebra $a = M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ which
is generated by the sixteen products $s_i \otimes s_j$, $i, j = 0, 1, 2, 3$, built with the spin
operators $s_1$, $s_2$, $s_3$ and $s_0 = 1/2$. The single-site operators $s_i \otimes s_0$ and $s_0 \otimes s_i$,
$i = 1, 2, 3$, represent single-spin operators, pertaining to the first, respectively the second,
of the two chains. The tensor products of single-site algebras from site $p$ to site $q$,
$p \leq q$, as in (7.184), are local algebras $\mathcal{A}_{[p,q]}$ and the union of these local algebras
over all possible finite sets of sites, together with its inductive limit, yields the quasi-local
algebra $\mathcal{A}$ that accommodates the physical observables of the two-chain system.

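We shall equip $\mathcal{A}$ with a thermal state $\omega_\beta$, at the bath temperature $1/\beta$, constructed
from the tensor product of single-site thermal states as in (7.210), $\omega_\beta = \bigotimes_k \omega_\beta^{[k]}$;
the only non-vanishing single-site expectations of spin operators are then:

$$\omega_\beta^{[k]}\big( s_3^{[k]} \otimes s_0^{[k]} \big) = \omega_\beta^{[k]}\big( s_0^{[k]} \otimes s_3^{[k]} \big) = -\frac{\eta}{4} , \qquad \omega_\beta^{[k]}\big( s_3^{[k]} \otimes s_3^{[k]} \big) = \frac{\eta^2}{4} , \qquad \eta := \tanh\Big( \frac{\beta \varepsilon}{2} \Big) .$$

As in (7.211), $\omega_\beta^{[k]}$ can be represented by a Gibbs density matrix $\rho_\beta^{[k]}$ constructed
with the site-$k$ Hamiltonian

$$h^{[k]} = \varepsilon \big( s_3^{[k]} \otimes \mathbf{1} + \mathbf{1} \otimes s_3^{[k]} \big) , \qquad \rho_\beta^{[k]} = \frac{e^{-\beta h^{[k]}}}{\big( 2 \cosh(\varepsilon \beta / 2) \big)^2} . \qquad (7.225)$$
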
Being the product of single-site states, the state $\omega_\beta$ does not support any correlation
between the two spin chains; further, it clearly obeys the clustering condition (7.186).
Following the construction in Sect. 7.7.2, we shall now focus on a subset of all
single-particle observables, specifically on:

$$x_1 = s_1 \otimes s_0 , \quad x_2 = s_2 \otimes s_0 , \quad x_3 = s_0 \otimes s_1 , \quad x_4 = s_0 \otimes s_2 , \qquad (7.226)$$

$$x_5 = s_1 \otimes s_3 , \quad x_6 = s_2 \otimes s_3 , \quad x_7 = s_3 \otimes s_1 , \quad x_8 = s_3 \otimes s_2 , \qquad (7.227)$$

and on the real linear span $\mathcal{X}$ (see (7.192)) generated by them. Observe that $\omega_\beta(x_\mu) = 0$,
$\mu = 1, 2, \ldots, 8$, and further that the first condition in Theorem 7.7.1 is satisfied,
since it simply reduces to $|\omega_\beta(x_{r_1} x_{r_2})| < \infty$.


Remark 7.7.3 Although there are sixteen single-site observables of the form
$s_j \otimes s_k$, $j, k = 0, 1, 2, 3$, it turns out that the set of local fluctuation operators,

$$F^{(N)}(x_\mu) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} \big( x_\mu^{[k]} - \omega_\beta(x_\mu) \big) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} x_\mu^{[k]} , \qquad (7.228)$$

corresponding to the above subset, gives rise to a set of mesoscopic Bosonic operators
$F_\mu$ whose Weyl algebra commutes with the one generated by the remaining
eight elements. One can thus restrict to the eight single-site operators in (7.226)
and (7.227). In addition, note that the pairs of operators $x_1, x_2$ and $x_3, x_4$ refer to
observables belonging to the first, respectively second spin chain: as we shall see,
they provide collective operators associated to two different mesoscopic degrees of
freedom.


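In order to explicitly construct the fluctuation algebra corresponding to the chosen
linear span $\mathcal{X}$, one needs to compute the correlation matrix $C^{(\beta)}$ as defined in (7.194).
Since $\omega_\beta$ is a product state, it follows that

$$C^{(\beta)}_{\mu\nu} = \lim_{N\to\infty} \omega_\beta\big(F^{(N)}(x_\mu)\, F^{(N)}(x_\nu)\big) = \mathrm{Tr}\big[\rho_\beta\, x_\mu x_\nu\big]\,, \qquad \mu, \nu = 1, 2, \ldots, 8\,.$$

These are the entries of an 8 x 8 matrix that can be expressed as a three-fold tensor
product of 2 x 2 matrices,

$$C^{(\beta)} = \frac{1}{16}\,(\mathbb{1}_2 - \eta\,\sigma_1) \otimes \mathbb{1}_2 \otimes (\mathbb{1}_2 + \eta\,\sigma_2)\,, \tag{7.229}$$

where $\sigma_i$ are the standard Pauli matrices, $\eta = \tanh(\beta\varepsilon/2)$, while $\mathbb{1}_2$ is the unit matrix in two dimensions.
In computing tensor products, we adopt the convention in which the entries of a
matrix are multiplied by the matrix to its right. Similarly, one easily obtains the
corresponding covariance matrix,

$$\Sigma^{(\beta)} = \frac{1}{16}\,(\mathbb{1}_2 - \eta\,\sigma_1) \otimes \mathbb{1}_2 \otimes \mathbb{1}_2\,, \tag{7.230}$$

and symplectic matrix,

$$\sigma^{(\beta)} = -\frac{i\,\eta}{8}\,(\mathbb{1}_2 - \eta\,\sigma_1) \otimes \mathbb{1}_2 \otimes \sigma_2\,, \tag{7.231}$$

so that $C^{(\beta)} = \Sigma^{(\beta)} + i\,\sigma^{(\beta)}/2$. The symplectic matrix gives the commutator of
the Bose operators $F_\mu$, namely the mesoscopic limits of the fluctuations in (7.228):
$[F_\mu, F_\nu] = i\,\sigma^{(\beta)}_{\mu\nu}$.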
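The tensor-product form of the correlation matrix can be cross-checked numerically against its definition as a single-site thermal expectation. The following is only a sketch under stated assumptions: $s_0 = \mathbb{1}/2$ and $s_i = \sigma_i/2$, the single-site Gibbs state built from $h = \varepsilon(s_3 \otimes \mathbb{1} + \mathbb{1} \otimes s_3)$, $\eta = \tanh(\beta\varepsilon/2)$, and the prefactors $1/16$ and $\eta/8$ that follow from these conventions; the function names are illustrative and not taken from the text.

```python
import numpy as np

# Pauli matrices and the spin operators, assuming s_0 = 1/2 and s_i = sigma_i / 2.
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
s = [I2 / 2, sx / 2, sy / 2, sz / 2]


def thermal_state(beta, eps):
    """Single-site Gibbs state for h = eps * (s_3 x 1 + 1 x s_3), which is diagonal."""
    h = eps * (np.kron(s[3], I2) + np.kron(I2, s[3]))
    weights = np.exp(-beta * np.real(np.diag(h)))
    return np.diag(weights / weights.sum()).astype(complex)


def correlation_matrix(beta, eps):
    """C_{mu nu} = Tr[rho x_mu x_nu] for the eight operators in (7.226)-(7.227)."""
    pairs = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 3), (2, 3), (3, 1), (3, 2)]
    x = [np.kron(s[i], s[j]) for i, j in pairs]
    rho = thermal_state(beta, eps)
    return np.array([[np.trace(rho @ a @ b) for b in x] for a in x])


def closed_form(beta, eps):
    """Three-fold tensor product form, with np.kron matching the stated convention."""
    eta = np.tanh(beta * eps / 2)
    return np.kron(np.kron(I2 - eta * sx, I2), I2 + eta * sy) / 16


beta, eps = 0.7, 1.3
C = correlation_matrix(beta, eps)
print(np.allclose(C, closed_form(beta, eps)))
```

The antisymmetric part of the same matrix, `-1j * (C - C.T)`, reproduces the symplectic matrix, so the decomposition into covariance and symplectic parts can be checked in the same way.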

Assume that the double chain interacts weakly with the environment in which it
is immersed and that the ensuing dissipative and noisy effects can be described
by a general master equation of the form (7.213)-(7.216) where for the N-site
Hamiltonian $H^{(N)}$ we take the sum of N copies of the single-site ones in (7.225),
$H^{(N)} = \sum_{k=1}^{N} h^{[k]}$, while the rest of the generator is constructed using the following
single-site operators:


Vi =S_EOS_, VWI=S_ OSL, V3 =53 Q 5S0, V4 = 59 B83, 


where $s_\pm = s_1 \pm i\, s_2$, while for the 4 x 4 matrix $D$ we take:

$$D = \mathbb{1}_2 \otimes \mathbb{1}_2 + \gamma\, \sigma_1 \otimes (\mathbb{1}_2 + \sigma_1)\,.$$

The parameter $\gamma$ needs to satisfy the condition $|\gamma| \le 1/2$ in order for $D$ to be positive
semi-definite (its eigenvalues are $1$, $1$ and $1 \pm 2\gamma$); it encodes the mixing-enhancing
power of the environment. With these choices, the dissipative part $\mathbb{D}^{(N)}$ of the generator
$L^{(N)}$ can be recast in a double commutator form, so that one explicitly has:


N 
LMX] = ic |H @1t+i1e@ st, x] 
k=1 


+5 y Je F Dap| aa e 7.232) 


k,t=1 a,G=1 
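The positivity condition on the 4 x 4 matrix $D$ can be verified directly from its spectrum. The sketch below assumes $D = \mathbb{1}_2 \otimes \mathbb{1}_2 + \gamma\,\sigma_1 \otimes (\mathbb{1}_2 + \sigma_1)$, as read from the text; `coupling_matrix` and `smallest_eigenvalue` are hypothetical helper names.

```python
import numpy as np

I2 = np.eye(2)
sigma1 = np.array([[0.0, 1.0], [1.0, 0.0]])


def coupling_matrix(gamma):
    """Assumed form of D: identity plus gamma * sigma_1 x (1 + sigma_1)."""
    return np.kron(I2, I2) + gamma * np.kron(sigma1, I2 + sigma1)


def smallest_eigenvalue(gamma):
    """D is real symmetric, so eigvalsh applies; the minimum equals min(1, 1 - 2|gamma|)."""
    return np.linalg.eigvalsh(coupling_matrix(gamma)).min()


for gamma in (0.3, 0.5, 0.6):
    print(gamma, smallest_eigenvalue(gamma))
```

The smallest eigenvalue stays non-negative up to $|\gamma| = 1/2$ and becomes negative beyond it, which is the stated threshold.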
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Since operators at different sites commute, the action of this generator on any operator 
xl at site k simplifies to: 


Laai P| = iel 8&1+18 s, a 


on S Daal [ull ai (v our 


a,b=1 


where (7.217) has been used. Then, the linear span $\mathcal{X}$ is mapped to itself by the action
of $L^{(N)}$; indeed, one finds $L^{(N)}\big[x_\mu^{[k]}\big] = \sum_{\nu=1}^{8} \mathcal{L}_{\mu\nu}\, x_\nu^{[k]}$, with the 8 x 8 hermitian
matrix $\mathcal{L}$ explicitly given by:


LHH+D =-ie1, @1,@02- JO(1s- yo @o1 ® 12) _ (7.233) 


Via the definitions (7.222) and (7.223), with $\mathcal{L}$ as in (7.233), one can now explicitly
construct the emergent mesoscopic dynamics $\Phi_t$ on the Weyl algebra of fluctuations
$\mathcal{W}(\mathcal{X}, \sigma^{(\beta)})$. It amounts to a semigroup of unital, completely positive
maps, whose generator is at most quadratic in the fluctuation operators
$F_\mu = m\text{-}\lim_{N\to\infty} F^{(N)}(x_\mu)$. Indeed, one finds that the map $W_t(r) = \Phi_t[W(r)] = e^{f_t(r)}\, W(r_t)$
is generated by a master equation of the form $\partial_t W_t(r) = L[W_t(r)]$,
with


p3 HO FFL, Wi] (7.234) 
pwv=1 
: 1 
+ 2 Pi (r, Wi Fy — 5{FuFv. m)) ; (7.235) 
H v= 


In (7.234), $H^{(\beta)}$ represents a Hermitian 8 x 8 matrix and $D^{(\beta)}$ in (7.235) is a positive
semi-definite 8 x 8 matrix, both expressible in terms of the correlation matrix
(7.229), the invertible symplectic matrix (7.231) and the matrix in (7.233):


9 = -il (£ Cc) — cee) (cB)-! | 


DM = (gA)-! (£ CO 4 GaL) (c)- | 


It proves convenient to describe the Weyl algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\beta)})$ by means of four-mode
Bosonic annihilation and creation operators $(a_i, a_i^\dagger)$, $i = 1, 2, 3, 4$, obeying
canonical commutation relations:

$$[a_i, a_j^\dagger] = \delta_{ij}\,, \qquad [a_i, a_j] = 0\,.$$
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One sets F, = ar fai (ai + aj ) , with complex coefficients fr , the non-vanishing 
ones being 


A = if! = f? = ifa = yn, 
fs! = ife! = fP = if? = h, (1.236) 


Te) ee 1) ne 
fs =ife = fr =ifg’ = . 
1-7 


From the first line of (7.232) one deduces that the creation and annihilation operators
$(a_1, a_1^\dagger)$ and $(a_3, a_3^\dagger)$, coming from the pairs of single-site operators $x_1, x_2$ and
$x_3, x_4$, refer to the first, respectively the second, chain. In other terms, $(a_1, a_1^\dagger)$ and
$(a_3, a_3^\dagger)$ describe two independent mesoscopic degrees of freedom emerging from
distinct chains. Instead, $(a_2, a_2^\dagger)$ and $(a_4, a_4^\dagger)$ result from combinations of spin
operators involving both chains at the same time.

The fluctuation algebra $\mathcal{W}(\mathcal{X}, \sigma^{(\beta)})$, generated by the Weyl operators $W(r) = e^{i\, r\cdot F}$,
inherits a quasi-free state $\Omega_\beta$ from the microscopic state $\omega_\beta$; it is defined by the
covariance matrix $\Sigma^{(\beta)}$ in (7.230), through the following expectation:

$$\Omega_\beta\big(W(r)\big) = e^{-\frac{1}{2}\,(r,\, \Sigma^{(\beta)} r)}\,, \qquad r \in \mathbb{R}^8\,.$$


In the formalism of creation and annihilation operators, the state $\Omega_\beta$ can be
represented by the following density matrix,

$$\rho_\Omega = \frac{e^{-\beta H}}{\mathrm{Tr}\big(e^{-\beta H}\big)}\,, \qquad H = \varepsilon \sum_{i=1}^{4} a_i^\dagger a_i\,, \tag{7.237}$$

namely by a Gibbs state at inverse temperature $\beta$ with respect to a quadratic Hamiltonian
$H$. As discussed earlier, since it comes from a time-invariant microscopic state $\omega_\beta$,
this mesoscopic state is also invariant under the action of the mesoscopic dynamics.

The setting is now established for investigating the possibility of bath-assisted
mesoscopic entanglement generation between the two spin chains. By mesoscopic
entanglement we mean the existence of mesoscopic states carrying non-local, quantum
correlations among the collective operators pertaining to different chains. More
precisely, we shall focus on the modes $(a_1, a_1^\dagger)$ and $(a_3, a_3^\dagger)$ that, as already
observed, are collective degrees of freedom attached to the first and second chain,
respectively. In order to have a non-trivial dynamics, as initial state we shall take the
time-invariant mesoscopic thermal state in (7.237), further squeezed with a common real
parameter $r$ along the first and third modes (see Example 5.5.2.1). The resulting
state is still uncorrelated, but its covariance matrix $\Sigma_r^{(\beta)}$, being $r$-dependent,
is no longer time-invariant; rather, it will follow the general evolution
given in (7.224). One can now study at any later time $t$ the entanglement content
of the reduced, two-mode Gaussian state obtained by tracing over the $(a_2, a_2^\dagger)$ and
$(a_4, a_4^\dagger)$ modes.

In practice, one needs to focus on the reduced covariance matrix, obtained from
$\Sigma_r^{(\beta)}(t)$ by eliminating rows and columns referring to the second and fourth mode.
The partial transposition criterion is exhaustive in this case [329], so that entanglement
is present between the remaining first and third collective modes if the smallest
symplectic eigenvalue $\Lambda(t)$ of the partially transposed two-mode, reduced covariance
matrix is negative. Actually, the logarithmic negativity,


E(t) = max fo, —log, an|} , 


gives a measure of the entanglement content of the state [219,336], and it can be
analytically computed for the model under study [30,31]. One then discovers that
the dissipative, mesoscopic dynamics $\Phi_t$ generated by (7.235) can indeed produce
mesoscopic quantum correlations at the level of the collective fluctuation observables
of the two initially separable infinite spin chains, but also that the amount of created
entanglement decreases and lasts for shorter times as the initial system temperature
increases, pointing to a critical temperature above which no entanglement is
possible [31].


7.8 Quantum Perceptrons 


As seen in Sect. 3.3, artificial Neural Networks are commonly implemented by classical
algorithms on ordinary computers; however, there has recently been a surge of
interest in physical Neural Networks, i.e. networks implemented on dedicated hardware.
At the same time, the advent of quantum computation has shown that purely
quantum mechanical features such as coherence and entanglement allow one to address
hard computational tasks with an exponential improvement in performance
compared to classical computation [266]. All this is spurring a line of research,
loosely termed Quantum Machine Learning [327,379], which explores the
interaction between Machine Learning and quantum computation, with the aim of
understanding whether the two fields can benefit from each other.

The simplest model of an artificial neuron has been discussed in Sect. 3.3. Several 
possibilities can be considered to implement a perceptron by means of a quantum 
architecture [56, 195,226,236,371,382]. In this context, it is important to investi- 
gate the capability of a particular model of quantum perceptron to achieve quantum 
advantages with respect to its classical counterparts. 

As seen in the case of a classical perceptron, its main limitation is that
classification is performed by separating patterns belonging to
different classes by means of a hyperplane in the vector space containing the patterns.
However, when a large number of features is considered, i.e. for patterns in a vector
space with a large dimension $N$, given $p$ randomly labeled patterns, by Cover's
function counting Theorem 3.3.1, it is extremely unlikely that a perceptron cannot
classify them if $p < 2N$ for large $N$ [112]. On the contrary, the probability that $p$
randomly labeled patterns can be classified by a simple perceptron becomes vanishingly
small for $p > 2N$ in the large $N$ regime. It is clear then that an important parameter to
characterize the performance of a perceptron is the ratio $\alpha = p/N$, and its critical
value $\alpha = 2$ has been identified as the critical storage capacity $\alpha_c$ of a classical
perceptron.
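The sharp change of behaviour at $p = 2N$ can be made concrete with a few lines of code. The sketch below assumes that Theorem 3.3.1 has the standard form of Cover's counting formula, $C(p, N) = 2\sum_{k=0}^{N-1}\binom{p-1}{k}$ for $p$ points in general position and hyperplanes through the origin; the function names are illustrative.

```python
from math import comb


def cover_count(p, n):
    """Number of linearly separable labelings of p points in general position in R^n."""
    return 2 * sum(comb(p - 1, k) for k in range(n))


def separable_fraction(p, n):
    """Fraction of the 2**p possible labelings that a simple perceptron can realize."""
    return cover_count(p, n) / 2 ** p


n = 100
for p in (150, 200, 250):
    print(p / n, separable_fraction(p, n))
```

For $N = 100$ the fraction is essentially 1 at $\alpha = 1.5$, exactly $1/2$ at $\alpha = 2$ and close to 0 at $\alpha = 2.5$; the transition sharpens as $N$ grows.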

As already touched upon in Sect. 3.3.3, in her seminal work [147], Gardner took 
a statistical approach to the storage capacity of neural networks, adopting tools of 
statistical physics and in particular methods from the theory of disordered systems. In 
the following, we first describe a particular model of quantum perceptron introduced 
in [56] and based on a continuous variable multi-mode quantum system. Then we 
follow the statistical approach to derive its critical storage capacity which we show 
to be always smaller than that of its classical limit. 


7.8.1 A Model of Continuous Quantum Perceptron 


Among the various possible activation functions, the nonlinear function known as 
Rectified Linear Unit, ReLu for short, has recently been much used in learning 
procedures [152] and we shall also use it for our purposes: 


$$\mathrm{ReLu}(z) := z^+ = \max(0, z) \qquad \forall\, z \in \mathbb{R}\,. \tag{7.238}$$


The scheme of continuous quantum perceptron we consider in the following [56] is
based on the use of $N$ Bosonic annihilation and creation operators $a_i$, $a_i^\dagger$, such that
$[a_i, a_i^\dagger] = 1$, or more conveniently on their quadratures

$$\hat q_i = \frac{a_i + a_i^\dagger}{\sqrt{2}}\,, \qquad \hat p_i = \frac{a_i - a_i^\dagger}{i\sqrt{2}}\,, \qquad [\hat q_i, \hat p_j] = i\,\delta_{ij}\,. \tag{7.239}$$

These operators behave as position and momentum observables and thus have
continuous spectrum, as discussed in Sect. 5.4. The position-like pseudo-eigenstates
$|x_j\rangle$, such that $\hat q_j |x_j\rangle = x_j |x_j\rangle$, play the same role as the computational basis
$\{|0\rangle, |1\rangle\}$ does in discrete qubit systems, and one refers to them as qmodes.
Approximating the pseudo-kets $|x_j\rangle$ as normalized Gaussians, narrowly peaked around $x_j$, they
will be used to encode the $N$ components of the input patterns of a classical
perceptron. Then, the tensor product pseudo-ket $|x\rangle = \bigotimes_{j=1}^{N} |x_j\rangle = |x_1, x_2, \ldots, x_N\rangle$
will encode the input pattern $x \in \mathbb{R}^N$.

A quantum circuit implementing the behaviour of a classical perceptron when
acting on a quantum continuous optical input is represented in Fig. 7.3. A multimode
input state, that is a common eigenstate $|x\rangle$ of the quadrature operators $\hat q_j$,
$j = 1, 2, \ldots, N$, $\hat q_j |x\rangle = x_j |x\rangle$, undergoes three successive steps. In the first one, each
component of the input state is rescaled by the gates At, which multiply each
eigen-position $x_j$ by a real factor $w_j$. This effect is achieved by means of squeezing unitary
operators (see Example 5.4.4) of the form

$$S(r) = \exp\big(i\, r\, (\hat q \hat p + \hat p \hat q)\big) = \exp\big(r\, (a^2 - (a^\dagger)^2)\big)\,, \qquad r \in \mathbb{R}\,, \tag{7.240}$$

which are such that

$$S^\dagger(r)\, \hat q\, S(r) = e^{-2r}\, \hat q\,, \qquad S^\dagger(r)\, \hat p\, S(r) = e^{2r}\, \hat p\,, \tag{7.241}$$
$$S^\dagger(r)\, a\, S(r) = a \cosh(2r) - a^\dagger \sinh(2r)\,. \tag{7.242}$$

From the first expression in (7.241) it follows that

$$S(r)\,|x\rangle = e^{-r}\, |e^{-2r} x\rangle\,. \tag{7.243}$$

Setting $w = e^{-2r}$, the following transformation is obtained:

$$|x_1, x_2, \ldots, x_N\rangle \mapsto \sqrt{w_1 w_2 \cdots w_N}\; |w_1 x_1, w_2 x_2, \ldots, w_N x_N\rangle\,. \tag{7.244}$$

Thus, the strengths $w_j$ of the rescaling processes implement the weights of the
classical perceptron.

Fig. 7.3 Quantum circuit scheme for a continuous valued quantum perceptron
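The rescaling of eigen-positions can be illustrated on a normalizable wave packet. The sketch below assumes the action $S(r)|x\rangle = \sqrt{w}\,|wx\rangle$ with $w = e^{-2r}$, which on a wave function reads $\psi(q) \mapsto \psi(q/w)/\sqrt{w}$; the grid-based helper functions are illustrative and only check that the norm is preserved and that the packet centre moves from $x$ to $wx$.

```python
import numpy as np


def gaussian_packet(center, sigma):
    """Normalized Gaussian wave function centred at `center` with width `sigma`."""
    return lambda q: np.exp(-(q - center) ** 2 / (2 * sigma ** 2)) / (np.pi * sigma ** 2) ** 0.25


def attenuate(psi, w):
    """Position-space action assumed for S(r) with w = exp(-2r): psi(q) -> psi(q / w) / sqrt(w)."""
    return lambda q: psi(q / w) / np.sqrt(w)


def norm_and_mean(psi, grid):
    """Riemann-sum estimates of the norm and of the mean position of |psi|^2."""
    dq = grid[1] - grid[0]
    density = np.abs(psi(grid)) ** 2
    norm = density.sum() * dq
    return norm, (grid * density).sum() * dq / norm


grid = np.linspace(-20.0, 20.0, 200001)
r = 0.5
w = np.exp(-2 * r)
norm, mean = norm_and_mean(attenuate(gaussian_packet(3.0, 0.2), w), grid)
print(norm, mean, 3.0 * w)
```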


Remarks 7.8.1

1. The rescaling performed by the squeezing operator $S(r)$ in (7.240) yields
$w = e^{-2r} > 0$. In order to implement negative weights, one can use a simple
Phase Shift gate $R(\phi) := \exp[i \phi\, a^\dagger a]$ with phase $\phi = \pi$. In this case the
rescaling leads to the compound transformation:

$$|x\rangle \mapsto \sqrt{w}\, |w x\rangle \mapsto \sqrt{w}\, |{-w x}\rangle\,. \tag{7.245}$$


2. The label At for such gates stands for “Attenuation gates”; indeed, usually $r > 0$
so that the gates At are used to squeeze the variance of the position-like degrees
of freedom. Obviously, in order to comply with the Heisenberg indeterminacy
relations, they amplify the momentum-like quadrature as in (5.107).

In the second step, the rescaled eigen-positions are recursively added by means
of so-called Controlled Addition (CX) gates. These gates act on two inputs by keeping
the first one unaltered and adding it to the second one. Using (5.72), they amount to
the following two-qmode unitary operator:

$$CX := \exp\big(-i\, \hat q_1 \otimes \hat p_2\big)\,, \qquad CX\big(|x_1\rangle \otimes |x_2\rangle\big) = |x_1\rangle \otimes |x_1 + x_2\rangle\,. \tag{7.246}$$

The combined action on the rescaled state of the $N - 1$ CX gates of the circuit in
Fig. 7.3 is then

$$|w_1 x_1, w_2 x_2, \ldots, w_N x_N\rangle \mapsto |w_1 x_1, w_1 x_1 + w_2 x_2, \ldots, w_N x_N\rangle \mapsto \cdots \mapsto |w_1 x_1, w_1 x_1 + w_2 x_2, \ldots, w \cdot x\rangle\,.$$

Thus, the iterated summation makes the position-like label of the last qmode equal to the
weighted sum of the input pattern components.

Using the $N$-th qmode as a reading mode, a bias $b \in \mathbb{R}$ can be added to its
position-like label by means of a displacement operator as in (5.77):

$$D(b) = \exp(-i\, b\, \hat p) = \exp\Big(\frac{b\, (a^\dagger - a)}{\sqrt{2}}\Big)\,, \tag{7.247}$$

$$D^\dagger(b)\, \hat q\, D(b) = \hat q + b\,, \qquad D^\dagger(b)\, a\, D(b) = a + \frac{b}{\sqrt{2}}\,. \tag{7.248}$$

Then, from the first equality in (7.248), one derives that the initial position-like
eigenstate $|x_N\rangle$ of the last qmode is finally transformed into

$$D(b)\,|w \cdot x\rangle = |w \cdot x + b\rangle\,, \tag{7.249}$$

hence providing the classical affine transformation (3.41).
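At the level of the position-like labels, the first two steps and the bias displacement reduce to elementary arithmetic. The following sketch tracks only the labels of the pseudo-eigenstates (the normalization factors $\sqrt{w_j}$ are ignored); the function name `perceptron_labels` is illustrative.

```python
def perceptron_labels(x, w, b):
    """Labels of the N qmodes after the At gates, the CX chain and the displacement D(b)."""
    # At gates: each eigen-position x_j is rescaled by the weight w_j.
    labels = [wj * xj for wj, xj in zip(w, x)]
    # CX chain: the gate between qmodes j and j+1 adds the j-th label to the (j+1)-th one.
    for j in range(len(labels) - 1):
        labels[j + 1] += labels[j]
    # Displacement on the last (reading) qmode adds the bias.
    labels[-1] += b
    return labels


print(perceptron_labels([1.0, -2.0, 0.5], [0.5, 1.0, 2.0], 0.25))
```

The last entry of the returned list is the affine combination $w \cdot x + b$, while the other entries are the partial sums left in the first $N - 1$ qmodes.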

The last step consists in implementing the non-linear activation function in
(7.238): this is achieved by adding an ancillary system, initialized in the pseudo
position-eigenstate $|0\rangle$, to the initial register consisting of $N$ input pseudo-kets. Then,
a measurement of the POVM (see Definition 5.6.1) given by the pseudo-projector
operators $P_y = |y\rangle\langle y|$ onto the position-like pseudo-eigenstates $|y\rangle$ is performed
on the $N$-th qmode. If the last qmode is in a state $\hat\rho$, such a measurement yields the
outcome $y$ with probability $P(y) = \langle y|\hat\rho|y\rangle$. Upon receiving $y$ as measurement
outcome, using the displacement operator (7.247), the pseudo-position eigenstate of
the ancilla peaked around $x = 0$ changes as follows:

$$|0\rangle \mapsto \begin{cases} |0\rangle & y \le 0 \\ |y\rangle & y > 0 \end{cases} \tag{7.250}$$

Remark 7.8.2 Such a conditional action eventually encodes the final result into a
state of the ancilla qmode. The continuous model of quantum perceptron presented
above will then act as a classifier according to whether the ancilla is or is not
displaced.


In the next section, we shall consider more realistic scenarios where position
pseudo-eigenstates are substituted by fairly well localized normalizable states. Consequently,
the ideal pseudo-eigenstate $|y\rangle$ in (7.250) will also be substituted by the
displaced vacuum state $D(y)|0\rangle$, where now $|0\rangle$ denotes the vacuum state, such that
$\hat a^\dagger \hat a\,|0\rangle = 0$.


7.8.2 Gaussian Input States 


As already observed, position-like pseudo-eigenstates cannot be realized in practice,
but only approximated by actual qmode states accessible in concrete experimental
contexts. One way to get close to $|x\rangle$ is by means of square-integrable Gaussian
states centered at $x$ with a very small variance. Then, in order to cope with realistic
situations, one needs to replace the input pseudo-kets $|x_j\rangle$ of the quantum circuit
in Fig. 7.3 by Gaussianly weighted normalized superpositions of pseudo-eigenstates of
the position-like quadrature $\hat q_j$:

$$|\psi_j\rangle = \frac{1}{(\pi \sigma_j^2)^{1/4}} \int dq_j\; e^{-\frac{(q_j - x_j)^2}{2\sigma_j^2}}\; |q_j\rangle\,. \tag{7.251}$$

The classical input components $x_j$ are thus encoded as the centers of Gaussian
weights with width $\sigma_j$.


Remark 7.8.3 Using Example 5.4.4, such states can be obtained by squeezing the
vacuum, $\langle x|0\rangle = \exp(-x^2/2)/\pi^{1/4}$, and then displacing the result:


1 (x — 6)? 
(x| D(5)S(r)|0) = oa exp ( re ) : (7.252) 


which reduces to (7.251) when e~” = ø; and 6 = xj. Notice that, when r becomes 
large, these states approximate a Dirac delta around xj. 


It then follows that the input state to the quantum circuit in Fig. 7.3 is

$$|\Psi(x)\rangle = \bigotimes_{j=1}^{N} |\psi_j\rangle = \prod_{j=1}^{N} \frac{1}{(\pi \sigma_j^2)^{1/4}} \int dq_j\; e^{-\frac{(q_j - x_j)^2}{2\sigma_j^2}}\; \bigotimes_{j=1}^{N} |q_j\rangle\,. \tag{7.253}$$

Since the measurement procedure that implements the ReLu activation function
is performed on the $N$-th qmode, one is interested in its state $\hat\rho_N$ after acting on
the input $N$-qmode state $|\Psi(x)\rangle$ with the quantum circuit in Fig. 7.3. The state $\hat\rho_N$ is
the reduced density matrix which results from tracing over the other $N - 1$ qmode
degrees of freedom and can be computed as follows. Using (7.244), each of the
constituent Gaussian wavepackets (7.251) turns into

$$\mathrm{At}\,|\psi_j\rangle = \frac{1}{(\pi \sigma_j^2)^{1/4}} \int dq_j\; e^{-\frac{(q_j - x_j)^2}{2\sigma_j^2}}\; \sqrt{w_j}\; |w_j q_j\rangle
= \frac{1}{(\pi w_j^2 \sigma_j^2)^{1/4}} \int dq_j\; e^{-\frac{(q_j - w_j x_j)^2}{2 w_j^2 \sigma_j^2}}\; |q_j\rangle\,,$$

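so that, acting with all the At and CX gates and setting the bias $b = 0$ for the
sake of simplicity, one obtains:

$$|\widetilde\Psi(x)\rangle = \prod_{j=1}^{N} \frac{\sqrt{w_j}}{(\pi \sigma_j^2)^{1/4}} \int dq_1\, dq_2 \cdots dq_N\; e^{-\frac{(q_1 - x_1)^2}{2\sigma_1^2}}\, e^{-\frac{(q_2 - x_2)^2}{2\sigma_2^2}} \cdots e^{-\frac{(q_N - x_N)^2}{2\sigma_N^2}}\; |w_1 q_1,\, w_1 q_1 + w_2 q_2,\, \ldots,\, w \cdot q\rangle\,. \tag{7.254}$$

By means of the following change of integration variables,

$$\tilde q_i = \sum_{k=1}^{i} w_k\, q_k\,, \quad i = 1, 2, \ldots, N\,, \qquad q_{i+1} = \frac{\tilde q_{i+1} - \tilde q_i}{w_{i+1}}\,, \tag{7.255}$$

the state (7.254) can be recast as:

$$|\widetilde\Psi(x)\rangle = \int \frac{d\tilde q_1\, d\tilde q_2 \cdots d\tilde q_N}{\prod_{j=1}^{N} (\pi w_j^2 \sigma_j^2)^{1/4}}\; e^{-\frac{(\tilde q_1 - w_1 x_1)^2}{2 w_1^2 \sigma_1^2}}\, e^{-\frac{(\tilde q_2 - \tilde q_1 - w_2 x_2)^2}{2 w_2^2 \sigma_2^2}} \cdots e^{-\frac{(\tilde q_N - \tilde q_{N-1} - w_N x_N)^2}{2 w_N^2 \sigma_N^2}}\; |\tilde q_1, \tilde q_2, \ldots, \tilde q_N\rangle\,. \tag{7.256}$$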

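Such a state is correctly normalized, as required by the unitary action of the various
gates, the corresponding orthogonal projector $P_{\widetilde\Psi(x)} = |\widetilde\Psi(x)\rangle\langle\widetilde\Psi(x)|$ being

$$P_{\widetilde\Psi(x)} = \int \frac{d\tilde q_1 \cdots d\tilde q_N\; d\tilde q_1' \cdots d\tilde q_N'}{\prod_{j=1}^{N} (\pi \sigma_j^2 w_j^2)^{1/2}}\;
e^{-\frac{(\tilde q_1 - w_1 x_1)^2}{2 w_1^2 \sigma_1^2}}\, e^{-\frac{(\tilde q_2 - \tilde q_1 - w_2 x_2)^2}{2 w_2^2 \sigma_2^2}} \cdots e^{-\frac{(\tilde q_N - \tilde q_{N-1} - w_N x_N)^2}{2 w_N^2 \sigma_N^2}}$$
$$\qquad \times\; e^{-\frac{(\tilde q_1' - w_1 x_1)^2}{2 w_1^2 \sigma_1^2}}\, e^{-\frac{(\tilde q_2' - \tilde q_1' - w_2 x_2)^2}{2 w_2^2 \sigma_2^2}} \cdots e^{-\frac{(\tilde q_N' - \tilde q_{N-1}' - w_N x_N)^2}{2 w_N^2 \sigma_N^2}}\;
|\tilde q_1, \tilde q_2, \ldots, \tilde q_N\rangle\langle \tilde q_1', \tilde q_2', \ldots, \tilde q_N'|\,.$$

Then, the reduced density matrix of the $N$-th qmode is

$$\hat\rho_N = \int d\tilde q_1 \cdots d\tilde q_{N-1}\; \langle \tilde q_1, \ldots, \tilde q_{N-1}|\, P_{\widetilde\Psi(x)}\, |\tilde q_1, \ldots, \tilde q_{N-1}\rangle$$
$$\quad = \int \frac{d\tilde q_1\, d\tilde q_2 \cdots d\tilde q_{N-1}}{\prod_{j=1}^{N} (\pi \sigma_j^2 w_j^2)^{1/2}}\; e^{-\frac{(\tilde q_1 - w_1 x_1)^2}{w_1^2 \sigma_1^2}} \cdots e^{-\frac{(\tilde q_{N-1} - \tilde q_{N-2} - w_{N-1} x_{N-1})^2}{w_{N-1}^2 \sigma_{N-1}^2}}$$
$$\qquad \times \int d\tilde q_N\, d\tilde q_N'\; e^{-\frac{(\tilde q_N - \tilde q_{N-1} - w_N x_N)^2}{2 w_N^2 \sigma_N^2}}\, e^{-\frac{(\tilde q_N' - \tilde q_{N-1} - w_N x_N)^2}{2 w_N^2 \sigma_N^2}}\; |\tilde q_N\rangle\langle \tilde q_N'|\,. \tag{7.257}$$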

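Then, performing a suitable measurement of the position-like quadrature on the
transformed $N$-th qmode yields the value $y$ with probability

$$P(y, x) = \langle y|\,\hat\rho_N\,|y\rangle = \int \frac{d\tilde q_1 \cdots d\tilde q_{N-1}}{\prod_{j=1}^{N} (\pi \sigma_j^2 w_j^2)^{1/2}}\; e^{-\frac{(\tilde q_1 - w_1 x_1)^2}{w_1^2 \sigma_1^2}} \cdots e^{-\frac{(\tilde q_{N-1} - \tilde q_{N-2} - w_{N-1} x_{N-1})^2}{w_{N-1}^2 \sigma_{N-1}^2}}\; e^{-\frac{(y - \tilde q_{N-1} - w_N x_N)^2}{w_N^2 \sigma_N^2}}\,. \tag{7.258}$$

By first integrating with respect to $d\tilde q_{N-1}$, then applying iteratively the relation

$$\int_{-\infty}^{+\infty} dx\; e^{-\frac{(x - z - a)^2}{b}}\; e^{-\frac{(y - x - c)^2}{d}} = \sqrt{\frac{\pi\, b\, d}{b + d}}\;\; e^{-\frac{(y - a - c - z)^2}{b + d}}\,,$$

and re-inserting the bias $b$, one gets

$$P(y, x) = \frac{1}{\sqrt{\pi \sum_{j=1}^{N} w_j^2 \sigma_j^2}}\; \exp\Big(-\frac{(y - b - w \cdot x)^2}{\sum_{j=1}^{N} w_j^2 \sigma_j^2}\Big)\,. \tag{7.259}$$

Such a probability distribution corresponds to a normalized Gaussian, centered
around the classical affine transformation (3.41). Depending on whether the actual
outcome $y$ of the measurement process is positive or negative, the ancillary qmode
will or will not be displaced and the final output read out.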
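The Gaussian form of the outcome distribution can be checked by direct sampling. The sketch below assumes that (7.259) is the normal density with mean $b + w \cdot x$ and variance $\frac{1}{2}\sum_j w_j^2 \sigma_j^2$, which follows from $|\psi_j(q)|^2$ being a normal density with variance $\sigma_j^2/2$; the numerical values and the function names are illustrative only.

```python
import numpy as np


def sample_outcomes(x, w, sigma, b, n_samples, rng):
    """Sample y = b + sum_j w_j q_j with q_j drawn from |psi_j|^2 (variance sigma_j^2 / 2)."""
    x, w, sigma = (np.asarray(v, dtype=float) for v in (x, w, sigma))
    q = rng.normal(loc=x, scale=sigma / np.sqrt(2.0), size=(n_samples, x.size))
    return q @ w + b


def outcome_density(y, x, w, sigma, b):
    """Assumed closed form of P(y, x): Gaussian centred at b + w.x."""
    x, w, sigma = (np.asarray(v, dtype=float) for v in (x, w, sigma))
    spread2 = np.sum(w ** 2 * sigma ** 2)
    return np.exp(-(y - b - w @ x) ** 2 / spread2) / np.sqrt(np.pi * spread2)


x, w, sigma, b = [1.0, -1.0, 0.5], [0.5, -1.0, 2.0], [0.3, 0.2, 0.1], 0.1
rng = np.random.default_rng(0)
y = sample_outcomes(x, w, sigma, b, 200_000, rng)
print(y.mean(), y.var())
```

With these inputs the sample mean should lie close to $b + w \cdot x = 2.6$ and the sample variance close to $\frac{1}{2}\sum_j w_j^2\sigma_j^2 = 0.05125$.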


Remark 7.8.4 However, differently from the case of position-like pseudo-eigenstates,
square-integrable states may yield all possible $y \in \mathbb{R}$ as position-like measurement
outcomes, which are distributed according to their specific probability distributions.
Therefore, classification errors are statistically possible; indeed, suppose
that $y > 0$ is the correct answer for a given classical input pattern $x$ encoded by the
quantum state $|\Psi(x)\rangle$. According to Remark 7.8.2, the ReLu activation function
can be used to classify input patterns according to whether the state of the ancilla
qmode is the vacuum, $|0\rangle$, or the coherent state $D(y)|0\rangle$ with $y > 0$. Then,

$$P_{err}(x) := \int_{-\infty}^{0} P(y, x)\, dy\,, \tag{7.260}$$

represents the worst error probability, that is the probability of obtaining $y < 0$ when
the classical perceptron would yield

$$\mathrm{ReLu}(b + w \cdot x) = b + w \cdot x > 0\,.$$

However, in the limit when all variances $\sigma_j$ vanish,

$$P(y, x) \to \delta\big(y - (b + w \cdot x)\big) \;\Longrightarrow\; P_{err}(x) \to 0\,. \tag{7.261}$$


Example 7.8.1 Consider the AND logical function examined in Example 3.3.1:
Fig. 7.4 depicts a quantum circuit that might be used to compute it. In order to
correctly classify the input pairs $(x_1, x_2)$, the ancilla qmode must output the vacuum
state $|0\rangle$ if one or both of the inputs $x_i$ are negative, or the displaced vacuum $|y\rangle$, with
$y > 0$, if both are positive (Table 7.1).

If one could work with position pseudo-eigenstates, then choosing $w_1 = w_2 = 1$
and $b = -1$ would correctly implement the AND function, as with the classical
perceptron in Sect. 3.3. Instead, in an actual experimentally feasible context, the use
of Displaced-Squeezed input states (7.252) unavoidably carries with it the cumulative
misclassification probability (7.260) according to the following table.


Fig. 7.4 Scheme for implementing the AND by a quantum perceptron

Table 7.1 Probability of misclassification vs squeezing parameter $r$ (columns: Input; $P_{err}(x_1, x_2)$ in % for each value of $r$, starting from $r = 0$)

Evidently, a correct implementation of the AND function requires that the input
states be sufficiently squeezed in order to reduce the probability of errors. Already
with squeezing factor $r = 1$, a good implementation is obtained: with this choice of
squeezing parameter, the worst-case scenario has a probability of error of just 0.3%.
In [56] one finds an analysis of the trade-off between the energy cost of the quantum
encoding by Gaussian wave-packets and the smallness of the error probability.
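The dependence of the misclassification probability on the width of the input wave packets can be explored with the closed form of (7.260). The sketch below assumes the Gaussian outcome distribution (7.259), a common input width $\sigma$, inputs $x_i = \pm 1$, and the choice $w_1 = w_2 = 1$, $b = -1$; it is parametrized directly by $\sigma$ rather than by the squeezing parameter, and the function name is illustrative.

```python
from math import erfc, sqrt


def misclassification_probability(x, w, b, sigma):
    """Probability that the measured y has the wrong sign, for a common input width sigma."""
    mean = b + sum(wj * xj for wj, xj in zip(w, x))
    spread = sqrt(sum((wj * sigma) ** 2 for wj in w))  # sqrt(sum_j w_j^2 sigma_j^2)
    # For the Gaussian (7.259), P(y < 0) = erfc(mean / spread) / 2.
    p_negative = 0.5 * erfc(mean / spread)
    return p_negative if mean > 0 else 1.0 - p_negative


weights, bias = (1.0, 1.0), -1.0
for sigma in (1.0, 0.5, 0.25):
    errors = [misclassification_probability(x, weights, bias, sigma)
              for x in ((1, 1), (1, -1), (-1, 1), (-1, -1))]
    print(sigma, [round(100 * e, 3) for e in errors])
```

The inputs whose affine value $b + w \cdot x$ is closest to zero, here $(1, 1)$, $(1, -1)$ and $(-1, 1)$, are the worst cases, and their error probability decreases rapidly as the common width shrinks.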


7.8.3 Quantum Storage Capacity: Statistical Approach 


In this section, following Gardner’s classical statistical approach as outlined in 
Sect. 3.3.3, we compute the critical storage capacity of the continuous quantum 
perceptron presented in the previous section, by means of replica trick techniques 
[52]. 

For the sake of simplicity, we shall set the bias $b = 0$ and encode $p$ input patterns by
means of Gaussian wave functions of the same width, $\sigma_j = \sqrt{2}\,\sigma$ for $1 \le j \le N$.
Then, the distribution (7.259) of the final measurement outcome becomes

$$P(y, x) = \frac{1}{\sqrt{2\pi}\,\sigma\,\|w\|}\; \exp\Big(-\frac{(y - w \cdot x)^2}{2\,\sigma^2\,\|w\|^2}\Big)\,. \tag{7.262}$$

As emphasized above in Remark 7.8.4, the encoding of the patterns $x^\mu$ into quantum
states makes the classification into one class, $\xi^\mu = 1$, or the other, $\xi^\mu = -1$, a
stochastic variable with probability


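$$R_{\kappa,\sigma}(w, x^\mu, \xi^\mu) = \int_{-\infty}^{+\infty} ds\; P_{x^\mu,\sigma}(s)\; \Theta\Big(\frac{\xi^\mu s}{\|w\|} - \kappa\Big)
= \int_{-\infty}^{+\infty} \frac{du}{\sqrt{2\pi}}\; \exp\Big(-\frac{1}{2}\Big(u - \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)\; \Theta\Big(\xi^\mu u - \frac{\kappa}{\sigma}\Big)\,, \tag{7.263}$$

where $\Theta(\cdot)$ denotes the Heaviside function, which yields

$$R_{\kappa,\sigma}(w, x^\mu, \xi^\mu) = \delta_{\xi^\mu, 1} \int_{\kappa/\sigma}^{+\infty} \frac{du}{\sqrt{2\pi}}\, \exp\Big(-\frac{1}{2}\Big(u - \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)
+ \delta_{\xi^\mu, -1} \int_{-\infty}^{-\kappa/\sigma} \frac{du}{\sqrt{2\pi}}\, \exp\Big(-\frac{1}{2}\Big(u - \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)$$
$$\quad = \delta_{\xi^\mu, 1} \int_{-\infty}^{-\kappa/\sigma} \frac{du}{\sqrt{2\pi}}\, \exp\Big(-\frac{1}{2}\Big(u + \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)
+ \delta_{\xi^\mu, -1} \int_{-\infty}^{-\kappa/\sigma} \frac{du}{\sqrt{2\pi}}\, \exp\Big(-\frac{1}{2}\Big(u - \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)$$
$$\quad = \int_{-\infty}^{-\kappa/\sigma} \frac{du}{\sqrt{2\pi}}\, \exp\Big(-\frac{1}{2}\Big(u + \xi^\mu \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)^2\Big)
= \Phi\Big(-\frac{\kappa}{\sigma} + \xi^\mu \frac{w \cdot x^\mu}{\sigma \|w\|}\Big)\,, \tag{7.264}$$

where use has been made of the cumulative probability

$$\Phi(x) := \int_{-\infty}^{x} \frac{du}{\sqrt{2\pi}}\; e^{-u^2/2}\,. \tag{7.265}$$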


Weights for a given pattern $x^\mu$ will be selected by asking that the probabilities
$R_{\kappa,\sigma}(w, x^\mu, \xi^\mu)$ be sufficiently close to 1:


Rg o(w, x", E) >1-—e, O<eX<l. (7.266) 


As in the classical case, we shall treat patterns and targets as independent,
identically distributed binary stochastic variables, with probabilities as in (3.56).
According to the previous considerations, because of quantum randomness, the relevant
classical statistical quantity, namely the relative volume in weight-space (3.54),
will be replaced by

$$V_N^Q\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon\big) = \frac{1}{Z_N} \int_{\mathbb{R}^N} dw\; \delta\big(\|w\|^2 - N\big)\; \prod_{\mu=1}^{p} \Theta\big(R_{\kappa,\sigma}(w, x^\mu, \xi^\mu) - 1 + \varepsilon\big)\,, \tag{7.267}$$

with the normalization $Z_N$ as in (3.55) and where the error parameter $\varepsilon$ needs to be
introduced since, in the quantum case, $R_{\kappa,\sigma}(w, x^\mu, \xi^\mu)$ cannot be equal to 1 unless
the Gaussian distribution becomes a Dirac delta. The probability $R_{\kappa,\sigma}(w, x^\mu, \xi^\mu)$
depends on the patterns $x^\mu$, on the targets $\xi^\mu$, on the weights $w$, on the threshold
parameter $\kappa$ and on the Gaussian width $\sigma$. Therefore, the quantum fraction of volume
$V_N^Q\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon\big)$ depends on patterns, targets, threshold, width and on the
allowed error $\varepsilon$: when the width $\sigma$ vanishes, from the distributional limit (7.261), one
recovers the expression of the fraction of weights of the classical perceptron (3.54),

$$V_N\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa\big) = \frac{1}{Z_N} \int_{\mathbb{R}^N} dw\; \delta\big(\|w\|^2 - N\big)\; \prod_{\mu=1}^{p} \Theta\Big(\xi^\mu\, \frac{w \cdot x^\mu}{\sqrt{N}} - \kappa\Big)\,,$$

an expression which does not depend on $\varepsilon$.


Remark 7.8.5 Allowing too large $\varepsilon$ would spoil the classification ability of the
quantum perceptron, the worst meaningful case being $\varepsilon = 1/2$. At this error value
the probability that a weight $w$ gives a wrong or right classification is the same. Note,
however, that $\varepsilon$ does not represent the actual error performed by the perceptron, but
just an upper bound to it. Indeed, were $\varepsilon$ a measure of how much the perceptron fails,
then $\varepsilon > 1/2$ would imply that wrong classifications are more likely than correct
ones. However, this does not imply that the perceptron becomes completely unreliable.
In fact, knowing that the perceptron is wrong more than half of the time, one can
classify patterns by choosing the opposite class with respect to that prescribed by the
perceptron.


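Then, as in (3.57), the replica trick allows one to compute the large $N$ behaviour of
$\big\langle \log V_N^Q\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon\big)\big\rangle_{x,\xi}$ as

$$\big\langle \log V_N^Q \big\rangle_{x,\xi} = \lim_{n \to 0^+} \frac{\big\langle (V_N^Q)^n \big\rangle_{x,\xi} - 1}{n} = \lim_{n \to 0^+} \frac{\log \big\langle (V_N^Q)^n \big\rangle_{x,\xi}}{n}\,, \tag{7.268}$$

by choosing $n$ an integer and considering the quantum relative volume in the
weight-space of $n$ replicas given by

$$\Big\langle \Big(V_N^Q\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon\big)\Big)^n \Big\rangle_{x,\xi} = \frac{1}{Z_N^n} \int_{\mathbb{R}^{nN}} d^{nN}W\; \delta^{(n)}\big(\|W\|^2\big)\;
\Big\langle \prod_{\gamma=1}^{n} \prod_{\mu=1}^{p} \Theta\big(R_{\kappa,\sigma}(w_\gamma, x^\mu, \xi^\mu) - 1 + \varepsilon\big) \Big\rangle_{x,\xi}\,, \tag{7.269}$$

where the subscript $\gamma$ enumerates the replicas and, for the sake of compactness, we set
$W = (w_1, \ldots, w_n) \in \mathbb{R}^{nN}$, and

$$d^{nN}W = \prod_{\gamma=1}^{n} dw_\gamma\,, \qquad \delta^{(n)}\big(\|W\|^2\big) = \prod_{\gamma=1}^{n} \delta\big(\|w_\gamma\|^2 - N\big)\,. \tag{7.270}$$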


In the following section, based on the statistical approach of Sect. 3.3.3 that led to
the definition of the classical storage capacity $\alpha_c$ in (3.58), the following expression
for the quantum storage capacity is derived:

$$\alpha_c^Q = \Big[\int_{-\tilde\kappa}^{+\infty} Dt\; (t + \tilde\kappa)^2\Big]^{-1}\,, \tag{7.271}$$

where the classical stability parameter $\kappa$ has changed into the quantum one

$$\tilde\kappa := \kappa + \sigma\, \Phi^{-1}(1 - \varepsilon)\,, \tag{7.272}$$

with $\Phi^{-1}$ the inverse of the cumulative probability in (7.265). One then observes
the following facts:

1. the quantum parameter $\tilde\kappa$ is always $\ge \kappa$ for the meaningful error values $\varepsilon \le 1/2$.
This means that, at the level of a simple, namely one-layer, quantum perceptron, its
classification performance is never better than that of a classical one, and no quantum
advantages are to be expected without going to at least two-layer ones. The reason for
such a negative result is that the fuzziness introduced by non-ideal quantum encoding,
$\sigma \neq 0$, and non-sharp quantum measurements, which ask to control probability via the
parameter $\varepsilon$, are sources of noise. Apparently, these are not counterbalanced by linear
superpositions or by non-local correlations, at least at the level of the model considered.
In other, discrete models of quantum perceptrons [236] the applicability of the
statistical approach to their capacity is not as evident and direct as in the continuous
case discussed above.

2. when the Gaussian dispersion vanishes, $\sigma \to 0^+$, so that, as seen in Remark 7.8.4,
Eq. (7.261), the quantum encoding becomes ideally the classical one, then $\tilde\kappa \to \kappa$
and one retrieves the classical storage capacity (3.58);

3. the same result, namely classical classification performance, holds when $\varepsilon =
1/2$, in which case, as observed in Remark 7.8.5, all weights $w$ have the same
probability of yielding a correct or wrong classification, meaning that the only
constraint in selecting the relative volume of weights is the classical stability
parameter.
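The three observations above can be checked numerically. The sketch below assumes that (7.271) has the Gardner form $\alpha_c(\kappa) = \big[\int_{-\kappa}^{+\infty} Dt\,(t+\kappa)^2\big]^{-1}$ with the Gaussian measure $Dt = dt\, e^{-t^2/2}/\sqrt{2\pi}$ and with $\kappa$ replaced by $\tilde\kappa = \kappa + \sigma\,\Phi^{-1}(1-\varepsilon)$; the integral is evaluated in closed form as $(1+\kappa^2)\,\Phi(\kappa) + \kappa\, e^{-\kappa^2/2}/\sqrt{2\pi}$, and the function names are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def gardner_capacity(kappa):
    """alpha_c(kappa) = 1 / int_{-kappa}^{inf} Dt (t + kappa)^2, evaluated in closed form."""
    density = exp(-kappa ** 2 / 2) / sqrt(2 * pi)
    integral = (1 + kappa ** 2) * _STANDARD_NORMAL.cdf(kappa) + kappa * density
    return 1.0 / integral


def quantum_stability(kappa, sigma, eps):
    """Assumed form of (7.272); requires 0 < eps < 1."""
    return kappa + sigma * _STANDARD_NORMAL.inv_cdf(1 - eps)


def quantum_capacity(kappa, sigma, eps):
    """Storage capacity obtained by replacing kappa with the quantum stability parameter."""
    return gardner_capacity(quantum_stability(kappa, sigma, eps))


print(gardner_capacity(0.0))              # classical value at kappa = 0
print(quantum_capacity(0.0, 0.3, 0.05))   # lower: sigma > 0 and eps < 1/2
print(quantum_capacity(0.0, 0.3, 0.5))    # classical value recovered at eps = 1/2
```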


7.8.4 Quantum Storage Capacity: Explicit Computation 


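Following [56], we now provide a detailed evaluation of the large $N$ behaviour of
the quantity $\big\langle \big(V_N^Q(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon)\big)^n \big\rangle_{x,\xi}$ in (7.269). Though rather
straightforward, the procedure is lengthy: the reason for presenting it here is to provide all
the technical steps necessary to perform it in more general quantum contexts. We start
by recasting

$$\Big\langle \Big(V_N^Q\big(\{x^\mu, \xi^\mu\}_{\mu=1}^{p}, \kappa, \sigma, \varepsilon\big)\Big)^n \Big\rangle_{x,\xi} = \frac{1}{Z_N^n} \int_{\mathbb{R}^{nN}} d^{nN}W\; \delta^{(n)}\big(\|W\|^2\big)\; A(W)\,, \tag{7.273}$$

with

$$A(W) := \Big\langle \prod_{\mu=1}^{p} \prod_{\gamma=1}^{n} \Theta\big(R_{\kappa,\sigma}(w_\gamma, x^\mu, \xi^\mu) - 1 + \varepsilon\big) \Big\rangle_{x,\xi}\,. \tag{7.274}$$

Using the distributional expression of the Heaviside function in (7.274) in terms of
the Dirac delta,

$$\Theta(x - u) = \int_{u}^{+\infty} d\lambda\; \delta(\lambda - x) = \int_{u}^{+\infty} d\lambda \int_{-\infty}^{+\infty} \frac{dy}{2\pi}\; e^{i\, y\, (\lambda - x)}\,, \tag{7.275}$$

one writes

$$\Theta\big(R_{\kappa,\sigma}(w_\gamma, x^\mu, \xi^\mu) - 1 + \varepsilon\big) = \int_{1-\varepsilon}^{+\infty} d\lambda_\gamma^\mu \int_{-\infty}^{+\infty} \frac{dy_\gamma^\mu}{2\pi}\; e^{i\, y_\gamma^\mu \big(\lambda_\gamma^\mu - R_{\kappa,\sigma}(w_\gamma, x^\mu, \xi^\mu)\big)}\,, \tag{7.276}$$

so that

$$A(W) = \frac{1}{(2\pi)^{np}} \int_{[1-\varepsilon, +\infty)^{np}} d^{np}\Lambda \int_{\mathbb{R}^{np}} d^{np}Y\; e^{\,i \sum_{\mu=1}^{p} \sum_{\gamma=1}^{n} y_\gamma^\mu \lambda_\gamma^\mu}\; B(Y, W)\,, \tag{7.277}$$

where we introduced the symbols

$$d^{np}\Lambda := \prod_{\mu=1}^{p} \prod_{\gamma=1}^{n} d\lambda_\gamma^\mu\,, \qquad d^{np}Y := \prod_{\mu=1}^{p} \prod_{\gamma=1}^{n} dy_\gamma^\mu\,,$$

and the quantity

$$B(Y, W) := \Big\langle \prod_{\mu=1}^{p} \prod_{\gamma=1}^{n} e^{-i\, y_\gamma^\mu\, R_{\kappa,\sigma}(w_\gamma, x^\mu, \xi^\mu)} \Big\rangle_{x,\xi}\,. \tag{7.278}$$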


Then, using (7.264) and the exponential representation of the Dirac delta, the assumed 
statistical independence of patterns and targets with different indices allows one to 
rewrite: 


H. 
BY, W) = (i [e (- ae z reine) 
xl, €F 


p=ly=1 ; 
H 
dng a(n Te =) eT OCH) 
-(IITf IWlo 
p=ly= = 
1 
= —__ d? 2 d"?H C(2,W 
(27)"P Ja. ( ) i 
sao EI (EAEE a2 
p=1 y=1 H=ly=1 
with 2 = {wh}, H = (Nu 


Pon 
dP Qa”? H = | | | [dotant, 


p=lyesl 
and 
p n u 
x’. W, 
ce,w):=] | (ap igh) wt L ) : (7.280) 
pal y=1 oN 


x,€ 
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Patterns and targets are assumed to have statistically independent and identically dis- 
tributed components +1. Then, using (3.56), the two averages can be easily computed 
yielding 


p N n : 
= TTT (e ON 


At leading order for N > 1, 


PpP N n „H 
C(2,W) = I] [ [er log cos yo 
p=1 j=1 y=1 ov N 


N 1 L wy jw” 
= exp } lo 1 ae | 
| | P| g z 2 s/N 


N 
= [Jep [10 (1 -aap Leh En | 


j=l p=1 
l n N 

= ep | -37H S wtwe D> wjwgy |» (7.281) 
7,8=1 j=l 


with correction terms that vanish as O (1 / JN N). Using that Iwl? = N and setting 
(see (3.62) in Sect. 3.3.3) 


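$$q_{\gamma\beta} = \frac{1}{N} \sum_{j=1}^{N} w_{\gamma, j}\, w_{\beta, j} \tag{7.282}$$

for $\gamma, \beta = 1, \ldots, n$, it follows that $q_{\gamma\gamma} = 1$, while regrouping the coefficients $q_{\gamma\beta}$
into an $n(n - 1)/2$-dimensional real vector $Q = \{q_{\gamma\beta}\}$, one can write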


N 
1 
OM, n ~ wa aQ dl 4 e 7 N 2 wad Bj 


x exp | — a (Dt +2 > whutays |). (1.283) 
y>ß=1 
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Therefore, at leading order in 1/N, (7.279) reads 


n 


d”? Q d? H 
B(Y, W) = i 2e a dg || 6 
Rap (27)"P Run-1/2 
y>ß=1 


x 1 (ep = Lor J= DG ages, (7.284) 
1 4 
x exp ( = (ur +2 5 qyp wfutt))). (7.285) 


y=1 y#ß=1 


448 — >? Wy, j WB, j 


Notice that the integrals involving different pattern indices $\mu = 1, 2, \dots, p$ are the
same; then, the large $N$ behaviour of (7.277) can be written as


n N 
1 
AW) = fag I] ô (e T N 3 meas) 
y>ß=1 j=l 


p 
«(som [8d fo, straint exp(K (n, A.3.42.0))) (7.286) 
=E, oo) n 


where we set $\eta = \{\eta_\gamma\}$, $y = \{y_\gamma\}$, $\lambda = \{\lambda_\gamma\}$, $\omega = \{\omega_\gamma\}$, for $\alpha, \beta, \gamma = 1, \dots, n$,


da=] |a, 4 i= [on dm= Ton de = [I 8 


and 


Km Ayw D= iD) [anm -(E+m)uy] 0287 
y=1 


za (Det +2 y WgWy G3 


y#ß=1 
Let us write the Dirac deltas appearing in (7.273) as 
shiwi- N) =f ee ene Dja wa, (1.288) 
and those in (7.286), as 
6 ao ee EE “= dEye Naya FaptiFya Dh wy, jwa 
hb N 2 Wwy, jwg, | = nf 7m © í . 


(7.289) 
Then, inserting them into (7.273), one finally arrives at the following explicit integral 
expression for the large N leading order 
(ve (fw. e} ave) ")e


f n N n 
(nye /2 m Pa 
y=1 j=1 y<ß=1 


N 
E E 
exp (iN + i ow?) x 
j=l 


N 
x exp ( — iNqygFy3 + iF yg a Wj ws) x 
j=1 


+00 1 P 
d f EF f d”A 1 aryaindwernrrao)| . (7.290) 
l-e T [1—e,+00)" R3” 


By regrouping the integrals over $w_{\gamma,j}$ with different indices $j$ and the same $\gamma$,


n?/2 gee dE dF. 
Q u ėuP ý P N Y 78 
(vs ({x 6 Jai + Rs É e)) Jes ~ ZN e I] IT 4r dang Qn 
y=1 y<ß=1 
n n N 
x| exp ( - =) E- i 5 Pans) | 
=] y<ß=1 
N 
n n 2 
l 2 
a Í. lI dwy exp ( 1 5 Fy gw we + 7 X Ey w?) 
y=1 y<ß=1 y=1 


1 P 
x zl "af d”yd”nd”w exp (K (n, A, y, w, Q, »| : 
[1—€,+00)" n 


Then, using (7.275), 


(ve (Ee eiea) he2 


2 n n 
Nv? dE, dF, 
a = dg.g— 7.291 
ZN f, II 4r fa, I] 46 2r ( ) 
y= y>ß=1 
x exp (N GŒ,F, Q)) (7.292) 


where $E = \{E_\gamma\}_{\gamma=1}^{n}$, $F = \{F_{\gamma\beta}\}_{\gamma>\beta=1}^{n}$ and


$$G(E, F, Q) = \frac{p}{N}\, G_1(Q) + G_2(E, F) + G_3(E, F, Q), \tag{7.293}$$
with
“ dà Z dydy dw 
G1(Q) = log f D f pic anl tai A 
[1—e, +00)” I 2r I Qn 
x exp (K (n, A, y, w, 0) , (7.294) 


G2(E, F) = log I. Jeo i$. Fpwywg + Lee , (7.295) 


y<ß 


$$G_3(E, F, Q) = -\frac{i}{2}\sum_{\gamma=1}^{n} E_\gamma - i\sum_{\gamma>\beta} F_{\gamma\beta}\, q_{\gamma\beta}. \tag{7.296}$$


n 
When N is large, the leading behaviour of (v? (i {x!, Py LKO, e)) ) i can 
x, 


be retrieved by saddle-point techniques: setting $z = (E, F, Q) \in \mathbb{C}^{n^2}$, one expands
$G(z)$ around $z^0 = (E^0, F^0, Q^0)$ such that $\partial G(z^0)/\partial z_k = 0$:

$$G(z) \simeq G(z^0) + \frac{1}{2}\sum_{j,k}\frac{\partial^2 G(z^0)}{\partial z_j\,\partial z_k}\,(z_j - z_j^0)(z_k - z_k^0).$$

Let $G''(z^0) := \Big[\frac{\partial^2 G(z^0)}{\partial z_j\,\partial z_k}\Big]$ be the $n^2 \times n^2$ Hessian matrix at the stationary point. If
such a matrix is not positive semi-definite, by suitably deforming the integration paths
into the complex domain, one can perform $n^2$ Gaussian integrations by rescaling with
$\sqrt{N}$ the corresponding integration variables and approximate:


2 


r 
(PE (meea h= ae (ee) i 


(7.297) 


According to (7.268), one needs to control the behaviour of the ratio


log (vg y (B, ole pete Dhe 


for $n \to 0^+$ and $N \to +\infty$. From (7.297) and using (3.55) one gets:


L log ((Vi? (et EHP Kyo, Nee 2 Laad) = 5 log(2re) 


o{ (7.298) 
+ (= : J l 
Making use of the replica-symmetric ansatz which states that the stationary point $z^0$
is replica-insensitive, one seeks it by setting


$$q_{\gamma\beta} = q, \qquad F_{\gamma\beta} = F, \qquad E_\gamma = E, \tag{7.299}$$


so that (7.287) becomes 


n 


K (9, A,¥,@.9) =i S [Ay — Py) — (E +m) wy] 
y=1 
2 


a _ ) n n 
tee Lie —s5 | do} - (7.300) 
= 


202 


Let us now focus upon the argument of the logarithm in (7.294): 


awf (Se) Lo) fT 
n) I= — ee a w. 
H—e,+00)" (oi 2m J JR» (yi 2m R 1 


n n 


. OK d-gy 
x exp iD (Ay - OM) = iD (2 +m)o- D w? 
y=1 y=1 y= 
q n 2 
-aZe 
y=1 
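Such a quantity can then be manipulated using the Gaussian representation

$$\exp\Big(-\frac{q}{2}\Big(\sum_{\gamma=1}^{n}\omega_\gamma\Big)^2\Big) = \int_{\mathbb{R}} D\tau\, \exp\Big(-i\,\tau\sqrt{q}\,\sum_{\gamma=1}^{n}\omega_\gamma\Big), \tag{7.301}$$

where the following useful notation has been introduced:

$$D\tau := \frac{d\tau}{\sqrt{2\pi}}\,\exp\Big(-\frac{\tau^2}{2}\Big). \tag{7.302}$$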


Via integration over the variables $\omega_\gamma$, one then writes


ndà dnd n 
ain = f PISS) S, (me) eeo 
[l-e +00)? \ 4 27 Ra NG 2r 


y=! 


2 
7 8 
2no2 K+ om +t/4) 
x [ Dr I] = exp ( : 
R ya] 1-q 2(1 — q) 
which consists of $n$ independent integrations with respect to $d\eta_\gamma$, $d\lambda_\gamma$ and $dy_\gamma$ for
each replica. Therefore,


A(n) = 


(s t+on+ IVa) 
2(1 — q) 


= adn n 
- [> ena) eG 


) 1+] exp 


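Due to its monotonicity, the function $\Phi(x)$ is invertible so that

$$\theta\Big[\Phi\Big(\frac{\eta}{\sigma}\Big) - 1 + \varepsilon\Big] = \theta\Big[\eta - \sigma\,\Phi^{-1}(1-\varepsilon)\Big].$$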
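The equivalence of the two step-function constraints can be checked numerically. The following is a minimal sketch, assuming that $\Phi$ is the standard normal cumulative distribution and that $\sigma > 0$; the function names and the sample values of $\sigma$ and $\varepsilon$ are illustrative choices, not taken from the text.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: Phi is the standard normal CDF


def step(x: float) -> int:
    """Heaviside step function, with the convention step(0) = 1."""
    return 1 if x >= 0 else 0


def constraint_via_phi(eta: float, sigma: float, eps: float) -> int:
    """theta[Phi(eta / sigma) - 1 + eps]."""
    return step(STD_NORMAL.cdf(eta / sigma) - 1.0 + eps)


def constraint_via_inverse(eta: float, sigma: float, eps: float) -> int:
    """theta[eta - sigma * Phi^{-1}(1 - eps)], which uses the invertibility of Phi."""
    return step(eta - sigma * STD_NORMAL.inv_cdf(1.0 - eps))


def constraints_agree(sigma: float, eps: float,
                      span: float = 5.0, points: int = 2001) -> bool:
    """Compare the two constraints on a uniform grid of eta values in [-span, span]."""
    grid = [-span + 2.0 * span * k / (points - 1) for k in range(points)]
    return all(
        constraint_via_phi(eta, sigma, eps) == constraint_via_inverse(eta, sigma, eps)
        for eta in grid
    )
```

For instance, `constraints_agree(0.7, 0.1)` compares the two expressions on 2001 grid points; a grid point lying within rounding error of the threshold $\sigma\,\Phi^{-1}(1-\varepsilon)$ could in principle make them differ, which is a floating-point effect and not a failure of the identity.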


Then, by changing the integration variable ņ into ÀA = o7 + «s, one gets 


tee dà Atiya) i 
wove fl] | 


exp 
+08-! (1-6 W201 — q) 2(1 — q) 


ga ed o 


where 


The replica trick consists in letting $n$ vanish as a continuous quantity; then, we can
use the first order approximation $z^n \simeq 1 + n \log z$, valid for $n \to 0$, and write:


ain) =1+n for log] 1- (SX) 


The following behavior for the function $G_1(q)$ thus ensues when $n \to 0^+$:


$$G_1(q) = \log\big(1 + n\, g(q)\big) \simeq n\, g(q), \tag{7.303}$$
where 
t/qtk 
:= | Drl 1- ð| ——— ; 7.304 
son orep (ow 
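
The first-order expansion $z^n \simeq 1 + n\log z$ on which the replica trick rests can be probed numerically. The snippet below is only an illustrative sketch with hypothetical function names; it shows that, for a fixed $z > 0$, the error of the approximation shrinks roughly like $n^2$ as $n \to 0$.

```python
import math


def replica_error(z: float, n: float) -> float:
    """Absolute error of the first-order expansion z**n ~ 1 + n*log(z), for z > 0."""
    return abs(z ** n - (1.0 + n * math.log(z)))


def replica_error_table(z: float, exponents=(1e-1, 1e-2, 1e-3)):
    """Errors for decreasing n; they are expected to shrink roughly like n**2."""
    return [(n, replica_error(z, n)) for n in exponents]
```

The neglected term is $(n\log z)^2/2$ at leading order, so reducing $n$ by a factor of 10 should reduce the error by a factor close to 100.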


Similarly, G2 (E, F) in (7.295) can be recast as 


n n 
, _E 
G2(E, F) = log [ates (i > Fuuytiz Jou?) , (7.305) 


y<p=1 y=! 
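where now $w = (w_1, \dots, w_n)$. Then, rewriting

$$\sum_{\gamma<\beta=1}^{n} w_\gamma w_\beta = \frac{1}{2}\Big[\Big(\sum_{\gamma=1}^{n} w_\gamma\Big)^2 - \sum_{\gamma=1}^{n} w_\gamma^2\Big],$$

one finds

$$iF\sum_{\gamma<\beta=1}^{n} w_\gamma w_\beta + i\,\frac{E}{2}\sum_{\gamma=1}^{n} w_\gamma^2 = i\,\frac{F}{2}\Big(\sum_{\gamma=1}^{n} w_\gamma\Big)^2 - i\,\frac{F-E}{2}\sum_{\gamma=1}^{n} w_\gamma^2,$$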
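and, using (7.301),

$$\exp\Big(iF\sum_{\gamma<\beta=1}^{n} w_\gamma w_\beta + i\,\frac{E}{2}\sum_{\gamma=1}^{n} w_\gamma^2\Big) = \int_{\mathbb{R}} D\tau\, \exp\Big(-\tau\sqrt{iF}\,\sum_{\gamma=1}^{n} w_\gamma - i\,\frac{F-E}{2}\sum_{\gamma=1}^{n} w_\gamma^2\Big).$$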


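After straightforward Gaussian integration,

$$\begin{aligned}
G_2(F, E) &= \log\int_{\mathbb{R}} D\tau\,\Big(\int_{\mathbb{R}} dw\, \exp\Big(-\tau\sqrt{iF}\,w - i\,\frac{F-E}{2}\,w^2\Big)\Big)^n \\
&= \log\int_{\mathbb{R}} D\tau\,\Big(\frac{2\pi}{i(F-E)}\Big)^{n/2}\exp\Big(n\,\frac{(\tau\sqrt{iF})^2}{2\,i(F-E)}\Big) \\
&= \frac{n}{2}\log\frac{2\pi}{i(F-E)} - \frac{1}{2}\log\Big(1 - n\,\frac{F}{F-E}\Big),
\end{aligned}$$

so that, at leading order in $n \to 0^+$, one then gets

$$G_2(F, E) \simeq \frac{n}{2}\Big(\log\frac{2\pi}{i(F-E)} + \frac{F}{F-E}\Big). \tag{7.306}$$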


Finally, the replica symmetry ansatz (7.299) yields the following leading order
behaviour for (7.296),

$$G_3(E, F, q) = -i\,\frac{nE}{2} - iFq\,\frac{n(n-1)}{2} \simeq \frac{i\,n}{2}\,(Fq - E). \tag{7.307}$$


Inserting (7.303), (7.306) and (7.307) into (7.293), we get at leading order in $n \to 0^+$,
with pattern to weight dimension ratio $\alpha := p/N$,

$$G(E, F, q) = n\,\alpha\, g(q) + \frac{n}{2}\Big(\log\frac{2\pi}{i(F-E)} + \frac{F}{F-E} + i\,(qF - E)\Big). \tag{7.308}$$
As a consequence, (7.298) yields


1 Qu ch n an as o1 
ee(n) _ = 090) ~ 5 lox@re) 


2 F-E N 


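The stationary point $z^0$ is then found by asking that

$$\frac{\partial G(z)}{\partial E} = \frac{\partial G(z)}{\partial F} = 0,$$

thus yielding the stationary point components:

$$F^0 = -\frac{i\,q}{(1-q)^2}, \qquad E^0 = i\,\frac{1-2q}{(1-q)^2}.$$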


In turn, from (7.308) one obtains 
d 2T F Me , log N
' (108 G57 b iq F D) | o( ie (7.309) 


$$\frac{G(E^0, F^0, q)}{n} = \alpha\, g(q) + \frac{1}{2}\Big(\log\big(2\pi(1-q)\big) + \frac{1}{1-q}\Big). \tag{7.310}$$

The sought after stationary point of $G(z)$ is finally obtained by setting

$$\partial_q G(E^0, F^0, q) = 0. \tag{7.311}$$
Using (7.304), one explicitly computes 
(t/q+hk)? 
&xP \ —“3d=9) ) t/q+k 
Og 9(q) = [> ( v (7.312) 


6 (GB) A 


As observed before Remark 3.3.4, the optimal storage capacity is obtained when
$q$ in (7.282), namely the average overlap of random weights, tends to 1. Then, the
argument of the cumulative probability $\Phi(x)$ in (7.304) (see (7.265)) tends to $\pm\infty$.
Expressing the cumulative probability in terms of the error function,

$$\Phi(x) = \frac{1 + \mathrm{erf}(x/\sqrt{2})}{2}, \qquad \mathrm{erf}(x) := \frac{2}{\sqrt{\pi}}\int_0^x dt\, e^{-t^2}, \tag{7.313}$$


we use the asymptotic behaviours

$$\mathrm{erf}(x) \simeq \pm 1 - \frac{e^{-x^2}}{\sqrt{\pi}\,x} \quad \hbox{when } x \to \pm\infty. \tag{7.314}$$
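
The relation between the cumulative probability and the error function, together with the large-argument behaviour of the Gaussian tail $1 - \Phi(x)$, can be illustrated with the Python standard library. This is a sketch under the assumption that $\Phi$ is the standard normal cumulative distribution; the leading-order tail formula used below is the textbook one, and the function names are hypothetical.

```python
import math


def Phi(x: float) -> float:
    """Standard normal cumulative distribution expressed through erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_tail(x: float) -> float:
    """1 - Phi(x), computed with erfc to avoid cancellation at large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gaussian_tail_leading(x: float) -> float:
    """Leading large-x behaviour of 1 - Phi(x): exp(-x^2/2) / (x * sqrt(2*pi))."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))
```

The ratio `gaussian_tail(x) / gaussian_tail_leading(x)` approaches 1 from below as `x` grows, the first correction being of relative order $1/x^2$.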
It thus follows that


(JGthyY\ ox Jqtii os 
— n = fa 2(1-4@) ) 2n J/1=—q te x/q + h 0 “i (7.315) 
X/SQ+K ~ 
1- o (*¥E*) 1 i xVG bE <0 


Therefore, for q — 17, the vanishing Gaussian terms in (7.312) can only be com- 
pensated if x /q + & = 0, so that 


TOO t/q@th. t/qti 
Og g( =- f Dt : (7.316) 
— -iya Vi-q * JI-q 


Finally, (7.310) and (7.311) yield

$$\alpha\,\partial_q g(q) = -\frac{q}{2(1-q)^2}. \tag{7.317}$$
Thus, from 


1ST] 2/400) 2- q) 


and (7.316) one retrieves that the quantum storage capacity a£ must fulfill the 
equality 


9 atk t t/qt+k 


+00 
o2 | D(t +k) =1 
-K 
that follows from the leading term in $(1-q)^{-1}$ when $q \to 1^-$ and yields the expression
(7.271) of the quantum storage capacity.
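
An equality of the Gardner type, $\alpha \int_{-\kappa}^{+\infty} Dt\,(t+\kappa)^2 = 1$, can be evaluated in closed form. The sketch below assumes exactly this form, with `kappa` standing for whatever effective stability parameter enters the final equality; it does not reproduce the expression (7.271), and all function names are hypothetical.

```python
import math


def gauss_density(t: float) -> float:
    """Density of the Gaussian measure Dt = dt exp(-t^2/2) / sqrt(2*pi)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def Phi(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gardner_integral(kappa: float) -> float:
    """Closed form of the integral of (t + kappa)^2 Dt over t >= -kappa."""
    return (1.0 + kappa ** 2) * Phi(kappa) + kappa * gauss_density(kappa)


def gardner_integral_numeric(kappa: float, upper: float = 12.0, steps: int = 20000) -> float:
    """Midpoint-rule check of the same integral, truncated at t = upper."""
    a = -kappa
    h = (upper - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += (t + kappa) ** 2 * gauss_density(t)
    return h * total


def capacity(kappa: float) -> float:
    """Value of alpha solving alpha * gardner_integral(kappa) = 1."""
    return 1.0 / gardner_integral(kappa)
```

With `kappa = 0` the integral equals 1/2, so that `capacity(0.0)` returns 2, the well-known classical value; increasing `kappa` lowers the capacity.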


7.8.4.1 Bibliographical Notes 

A most exhaustive and complete review of the algebraic approach to quantum sta- 
tistical mechanics is provided by [80]; in [81] one finds plenty of applications to 
spin and continuous systems. A fully developed mathematical theory of the canon- 
ical commutation and anti-commutation relations, quasi-free states and quasi-free 
automorphisms can be found in [240—242, 280, 302-304]. 

Quantum ergodicity and mixing are presented in [80, 129, 307,328,353] in increas- 
ing order of mathematical sophistication; the second reference has provided most of 
the material of this book concerning these topics, the first one concerning mixing. 
In [80] one also finds a detailed discussion of decomposition theory, while in [353] 
more recent developments are presented. In [257,263] what has been called mix- 
ing in this book is termed clustering, the qualification mixing being assigned to a 
stronger clustering behavior which is discussed in connection with Galilei-invariant 
interactions [260]. In [129,353] one finds enlightening discussions about the physi- 
cal meaning of the different algebraic factor types and of Tomita-Takesaki modular 
theory. 
Applications of quantum mechanics with infinite degrees of freedom to collective
phenomena and thermodynamics can be found in [324], while in [342,343] the
emphasis is more on symmetry breaking phenomena and on the existence of inequivalent
representations of the CCR and CAR with applications to physically relevant
models. The book [363] offers an excellent review of quantum fluctuation theory and
its applications in very many contexts, all of them however referring to reversible 
microscopic dynamics, while some instances of dissipative ones are discussed in the 
review [32]. 

Quantum information related issues involving infinitely many degrees of freedom 
and the necessary mathematical tools like quantum compression theorems and quan- 
tum capacities are discussed in [176,282,295]. For a review of different formulations 
of quantum capacity related quantities and their relations see [216]. 


Part III


Quantum Dynamical Entropies 
and Complexities 


The last part of the book first deals with two extensions of the Kolmogorov dynam- 
ical entropy to quantum systems and with their applications. Then, it discusses 
some generalizations to quantum systems of classical algorithmic complexity. 




Quantum Dynamical Entropies 


The first part of this book has been devoted to illustrating some of the many properties
of the classical dynamical entropy of Kolmogorov and Sinai; in particular, it has been
shown that it provides the optimal compression rate of ergodic sources
(Shannon-McMillan-Breiman Theorem 3.2.1), while, through the positive Lyapounov
exponents (Pesin’s Theorem), it measures the dynamical instability of classical
dynamical systems; finally, it gives the complexity rate of almost all trajectories of ergodic
dynamical systems (Brudno’s Theorem 4.2.1).

Several extensions of the KS entropy to quantum dynamical systems can be 
found in the mathematical and physical literature (see the bibliographical notes at 
the end of this chapter). All of them predated or were developed independently of 
quantum information; due to its rapid growth, the latter more and more appears as an 
ideal ground for testing the physical meaning and the technical usefulness of these 
proposals. 

One of the aims behind the attempts at defining quantum dynamical entropies was 
the possibility of classifying quantum dynamical systems, as much as it had been done 
for classical dynamical systems by means of the KS entropy (see Remark 3.1.2). 
Afterwards, the quantum dynamical entropies have been applied to the study of 
quantum chaotic phenomena and the quantum/classical correspondence; recently, 
they have been used to shed light on certain foundational aspects of quantum infor- 
mation, like quantum capacity and quantum algorithmic complexity. 

Of the various quantum dynamical entropies that have been proposed in recent
years, we shall mainly focus upon two of them; namely, the entropy of Connes,
Narnhofer and Thirring [108] (CNT entropy) and of Alicki and Fannes [10] (AFL
entropy).¹ These quantum dynamical entropies embody two radically different ways
of approaching the notion of information production in quantum mechanics; indeed,
they may behave differently on the same quantum dynamical system.

¹ The L in AFL stands for Lindblad who introduced the notion in [230-232].

In Sect. 2.4, we have seen that partitions of the phase-space into finitely many,
disjoint measurable atoms provide classical dynamical systems $(\mathcal{X}, T, \mu)$ (see
Definition 2.2.2) with symbolic models: finite measurable partitions $\mathcal{P} = \{P_i\}_{i\in I}$ can be
used to successively localize the moving phase-point within their atoms $P_i$ and to
quantify the predictability of the dynamics via the information relative to the next
time-step that is gained by observing the evolving system.

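In Chap. 3, partitions have been interpreted as POVMs taken from a commutative
dynamical triple $(L^\infty_\mu(\mathcal{X}), \Theta_T, \omega_\mu)$, where atoms have been identified with their
characteristic functions and thus with orthogonal projections summing up to the
identity (see Definition 2.2.3.2). Thus, partitions $\mathcal{P}$ define partitions of unit (see
Definition 5.6.1) and CPU maps $\mathbb{E}_{\mathcal{P}}$ on the C*-algebra $L^\infty_\mu(\mathcal{X})$. However, because
of commutativity, the action of $\mathbb{E}_{\mathcal{P}}$ on $L^\infty_\mu(\mathcal{X})$ reduces to the identity map; indeed, for
all $f \in L^\infty_\mu(\mathcal{X})$,

$$\mathbb{E}_{\mathcal{P}}[f](x) = \sum_{i\in I}\chi_{P_i}(x)\, f(x)\, \chi_{P_i}(x) = \sum_{i\in I}\chi_{P_i}(x)\, f(x) = f(x).$$

The dynamics is thus insensitive to measurements, $\Theta_T\circ\mathbb{E}_{\mathcal{P}} = \mathbb{E}_{\mathcal{P}}\circ\Theta_T = \Theta_T$, as
well as the states on $L^\infty_\mu(\mathcal{X})$: $\mathbb{F}_{\mathcal{P}}[\omega_\mu] = \omega_\mu$, where $\mathbb{F}_{\mathcal{P}}$ is the dual CP map such that
$\mathbb{F}_{\mathcal{P}}[\omega_\mu](f) = \omega_\mu(\mathbb{E}_{\mathcal{P}}[f])$.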

Given a quantum dynamical triplet $(\mathcal{A}, \Theta, \omega)$, if one wants to extend the notion
of partition to such a non-commutative context, a natural step is to substitute
commuting projections with non-commuting ones or, more in general, with
non-projective POVMs. Differently from the classical case, the CPU maps $\mathbb{E} : \mathcal{A} \to \mathcal{A}$
associated with them do not in general commute with the quantum dynamics,
$\Theta\circ\mathbb{E} \neq \mathbb{E}\circ\Theta$, nor do the dual maps $\mathbb{F}$ preserve the quantum state, $\mathbb{F}[\omega] \neq \omega$.
Both $\mathbb{E}$ and $\mathbb{F}$ act as external perturbations; therefore, a preliminary question arises
whether or not one should incorporate measurement processes into the very
construction of quantum dynamical entropies.
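
The contrast between the commutative and the non-commutative case can be made concrete on a two-level system. The following is a minimal sketch, not taken from the text: the measurement map is assumed to be of the projective form $X \mapsto \sum_i P_i X P_i$, and the matrices and function names are illustrative choices.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def measurement_map(x, projections):
    """E[X] = sum_i P_i X P_i for orthogonal projections P_i summing to the identity."""
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    for p in projections:
        out = add(out, matmul(matmul(p, x), p))
    return out


def expectation(rho, x):
    """omega(X) = Tr(rho X) for a state given by the density matrix rho."""
    return trace(matmul(rho, x))


# Projections onto the two basis vectors: a "partition" of a two-level system.
P0 = [[1.0, 0.0], [0.0, 0.0]]
P1 = [[0.0, 0.0], [0.0, 1.0]]

# Commutative case: a diagonal observable commutes with P0 and P1.
F_DIAG = [[2.0, 0.0], [0.0, -3.0]]

# Non-commutative case: sigma_x does not commute with P0 and P1.
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]

# State given by the projection onto (|0> + |1>)/sqrt(2).
RHO = [[0.5, 0.5], [0.5, 0.5]]
```

Here `measurement_map(F_DIAG, [P0, P1])` returns `F_DIAG` unchanged, whereas `measurement_map(SIGMA_X, [P0, P1])` is the zero matrix; accordingly the expectation of `SIGMA_X` in the state `RHO` drops from 1 to 0 once the map is applied, which illustrates how the dual map fails to preserve the state.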

If the answer is yes, then, beside the dynamics itself, measurement processes 
themselves may act as a source of randomness; on the other hand, if the answer is 
no, the regular and irregular features of the dynamics refer to the system only, but 
are insensitive to the typical quantum phenomenon that getting information about 
quantum systems in general perturbs them. In other words, a perturbation-free quan- 
tifier of quantum dynamical randomness might not measure the actual information 
production that always comes from observations of the time-evolving system; on the 
other hand, a quantifier of quantum dynamical randomness that takes into account 
acquisition of information through measurement processes would add the random- 
ness coming from the latter ones to that proper to the quantum dynamics itself. 

The purpose of this and the last chapter is to convey the idea that, unlike in classical
dynamics, randomness in quantum dynamics has more than one facet and that
choosing one of the two answers above just means exploring two of these inequivalent
aspects.
The non-commutative algebraic structure which most closely resembles a commutative
one is that of type $II_1$ factor von Neumann algebras $\mathcal{A}$ (see point 2 after
Definition 7.3.1). Indeed, the state $\omega$ which makes $\mathcal{A}$ a type $II_1$ factor is a
normalized trace such that $\omega(XY) = \omega(YX)$ for all $X, Y \in \mathcal{A}$. The CNT entropy [108]
generalizes to generic von Neumann algebras previous extensions of the KS entropy
to type $II_1$ factors that were based on the above commutativity with respect to the
state [109, 129].

The CNT entropy quantifies the information rate in quantum dynamical systems
described by algebraic triplets $(\mathcal{A}, \Theta, \omega)$ and it does so independently of external
measurement processes, by relying only on the algebraic properties of $\mathcal{A}$, $\Theta$ and $\omega$.
The basic idea of the whole construction is a clever use of the relation between entropy 
and relative entropy that has been discussed in Sect. 6.3.1 in relation to the entropy 
of a subalgebra (more in general of a CPU map: see Definitions 6.3.2 and 6.3.3). 
Before getting to that, we indicate why, in general, the steps that in Sect.3.1 led to 
the KS entropy are not practicable in a quantum setting. 

Suppose $M \subset \mathcal{A}$ is a finite-dimensional subalgebra; if $(\mathcal{A}, \Theta, \omega)$ is a classical
dynamical triplet, the KS entropy is constructed by considering

• the finite partition corresponding to the subalgebra $M$;

• the partition $M^{(n)} = \bigvee_{k=0}^{n-1}\Theta^k(M)$ generated by the time-evolved partitions
$\{\Theta^k(M)\}_{k=0}^{n-1}$;

• the Shannon entropy of the state $\omega$ restricted to $M^{(n)}$: $H(\omega\restriction M^{(n)})$;

• the asymptotic rate $\displaystyle\lim_{n\to\infty}\frac{1}{n}\,H(\omega\restriction M^{(n)})$.
In the non-commutative context, by sheer analogy one might consider

• any finite-dimensional subalgebra $M \subset \mathcal{A}$;

• the finite-dimensional subalgebras $M^{(n)} := \bigvee_{k=0}^{n-1}\Theta^k(M)$ generated by the $n$
(finite-dimensional) subalgebras $\Theta^k(M)$;

• the von Neumann entropy of the restricted state $\omega\restriction M^{(n)}$: $S(\omega\restriction M^{(n)})$;

• the asymptotic rate $\displaystyle\lim_{n\to\infty}\frac{1}{n}\,S(\omega\restriction M^{(n)})$.


However, this argument generally fails and the reason why it does fail is
non-commutativity [354]: despite being finite-dimensional, the subalgebras at different
times $\Theta^k(M)$ need not commute and can thus generate an infinite-dimensional
subalgebra $M^{(n)}$, so that $S(\omega\restriction M^{(n)})$ cannot in general be controlled.

In order to overcome these difficulties, the idea is to extend the entropy of a CPU
map $H_\omega(\gamma)$ (see Definition 6.3.3) to the entropy $H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n)$ of $n$ CPU maps
$\gamma_i : M_i \to \mathcal{A}$ from finite-dimensional C* algebras $M_i$ (that we shall always suppose
with identity) into $\mathcal{A}$.
Remark 8.1.1 As in Sect. 6.3.1, when $M$ is a subalgebra of $\mathcal{A}$, then one chooses as
CPU map $\gamma$ the natural embedding $\iota_M$ of $M$ into $\mathcal{A}$. Therefore, when dealing with
the set of subalgebras $\{\Theta^j(M)\}_{j=0}^{n-1}$, the $n$ CPU maps are given by $\gamma_j := \Theta^j\circ\iota_M$.


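Like in the case of $H_\omega(\gamma)$, we shall consider linear convex decompositions of
the state $\omega$ in terms of states $\omega_{i^{(n)}}$. These states will now be indexed by strings
$i^{(n)} = i_1 i_2\cdots i_n$, $i_j \in I_j$, each CPU map $\gamma_j$ being associated with a generic index
set $I_j$, and will carry weights $\lambda_{i^{(n)}}$. Fixing $i_j \in I_j$, after summing over all the
other indices and after renormalization, one obtains auxiliary decompositions of $\omega$
associated to each $\gamma_j$. Concretely, from the multi-index decomposition

$$\omega = \sum_{i^{(n)}\in I^{(n)}} \lambda_{i^{(n)}}\,\omega_{i^{(n)}}, \qquad I^{(n)} := I_1\times I_2\times\cdots\times I_n, \tag{8.1}$$

one obtains subdecompositions $\omega = \sum_{i_j\in I_j}\lambda^j_{i_j}\,\omega^j_{i_j}$, $j = 1, 2, \dots, n$, where

$$\lambda^j_{i_j} := \sum_{\substack{i^{(n)}\\ i_j\ \mathrm{fixed}}} \lambda_{i^{(n)}}, \qquad \omega^j_{i_j} := \sum_{\substack{i^{(n)}\\ i_j\ \mathrm{fixed}}} \frac{\lambda_{i^{(n)}}}{\lambda^j_{i_j}}\,\omega_{i^{(n)}}. \tag{8.2}$$

Let $\Lambda^{(n)} := \{\lambda_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$ be the probability distribution associated with the weights
in (8.1) and $\Lambda_j := \{\lambda^j_{i_j}\}_{i_j\in I_j}$ the marginal probability distributions consisting of the
weights in (8.2). The generalization of Definition 6.3.3 is as follows.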


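Definition 8.1.1 (n-CPU Entropies) Given a C* algebra $\mathcal{A}$ equipped with a state $\omega$,
let $\gamma_j : M_j \to \mathcal{A}$, $j = 1, 2, \dots, n$, be CPU maps from finite-dimensional C* algebras
into $\mathcal{A}$. Their entropy with respect to $\omega$ is:

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) = \sup_{\omega=\sum_{i^{(n)}}\lambda_{i^{(n)}}\omega_{i^{(n)}}}\Big\{H(\Lambda^{(n)}) - \sum_{j=1}^{n} H(\Lambda_j) + \sum_{j=1}^{n}\sum_{i_j\in I_j}\lambda^j_{i_j}\, S\big(\omega^j_{i_j}\circ\gamma_j\,,\,\omega\circ\gamma_j\big)\Big\} \tag{8.3}$$

$$= \sup_{\omega=\sum_{i^{(n)}}\lambda_{i^{(n)}}\omega_{i^{(n)}}}\Big\{H(\Lambda^{(n)}) - \sum_{j=1}^{n} H(\Lambda_j) + \sum_{j=1}^{n}\Big(S(\omega\circ\gamma_j) - \sum_{i_j\in I_j}\lambda^j_{i_j}\, S\big(\omega^j_{i_j}\circ\gamma_j\big)\Big)\Big\}, \tag{8.4}$$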
where, with $\eta(x) := -x\log x$, $x \in [0, 1]$,

$$H(\Lambda^{(n)}) = \sum_{i^{(n)}} \eta\big(\lambda_{i^{(n)}}\big), \qquad H(\Lambda_j) = \sum_{i_j\in I_j}\eta\big(\lambda^j_{i_j}\big).$$
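
The purely classical part of the argument of the supremum, $H(\Lambda^{(n)}) - \sum_j H(\Lambda_j)$, only involves the weights and their marginals and is easy to compute. The sketch below is an illustration under the assumption that the weights are stored in a dictionary keyed by the strings $(i_1, \dots, i_n)$; the function names are hypothetical and the relative entropy terms are not included.

```python
import math
from collections import defaultdict


def eta(x: float) -> float:
    """eta(x) = -x log x, with eta(0) = 0."""
    return 0.0 if x == 0 else -x * math.log(x)


def shannon(dist: dict) -> float:
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(eta(p) for p in dist.values())


def marginal(joint: dict, j: int) -> dict:
    """Weights obtained by summing the joint weights over all strings with i_j fixed."""
    out = defaultdict(float)
    for string, weight in joint.items():
        out[string[j]] += weight
    return dict(out)


def classical_term(joint: dict) -> float:
    """H(Lambda^(n)) - sum_j H(Lambda_j) for weights indexed by strings (i_1, ..., i_n)."""
    n = len(next(iter(joint)))
    return shannon(joint) - sum(shannon(marginal(joint, j)) for j in range(n))
```

For product weights the term vanishes, while for perfectly correlated weights such as `{(0, 0): 0.5, (1, 1): 0.5}` it equals $-\log 2$; it is never positive, by subadditivity of the Shannon entropy.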


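As in Sect. 6.3.1, a concrete way to construct decompositions of $\omega$ is to use the
GNS construction based on the state $\omega$; it follows that the states $\omega_{i^{(n)}}$ contributing to
$\omega = \sum_{i^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ are in one-to-one correspondence with the positive elements of
the commutant of $\pi_\omega(\mathcal{A})$:

$$\lambda_{i^{(n)}}\,\omega_{i^{(n)}}(X) = \langle\Omega_\omega|\,Y'_{i^{(n)}}\,\pi_\omega(X)\,|\Omega_\omega\rangle, \tag{8.5}$$

where $0 \le Y'_{i^{(n)}} \in \pi_\omega(\mathcal{A})'$ and $\sum_{i^{(n)}} Y'_{i^{(n)}} = 1$. Also, if $\omega$ is faithful, one can express
the decomposing states in terms of $0 \le X \in \pi_\omega(\mathcal{A})''$ by means of the modular
automorphism $\sigma_\omega$ (see (5.191) in Remark 5.6.1.3):

$$\lambda_{i^{(n)}}\,\omega_{i^{(n)}}(X) = \langle\Omega_\omega|\,\sigma_\omega^{i/2}\big(\pi_\omega(Y_{i^{(n)}})\big)\,\pi_\omega(X)\,|\Omega_\omega\rangle, \tag{8.6}$$

where $0 \le Y_{i^{(n)}} \in \mathcal{A}$ and $\sum_{i^{(n)}} Y_{i^{(n)}} = 1$.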
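In analogy with (6.59), we shall denote by

$$H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}\big) = H(\Lambda^{(n)}) - \sum_{j=1}^{n} H(\Lambda_j) + \sum_{j=1}^{n}\sum_{i_j\in I_j}\lambda^j_{i_j}\, S\big(\omega^j_{i_j}\circ\gamma_j\,,\,\omega\circ\gamma_j\big) \tag{8.7}$$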

the contribution to the n-subalgebra entropy coming from a chosen decomposition 
of the state w. 

The n-CPU entropies enjoy a number of very useful properties that can luckily 
be proved without being obliged to know the optimal decompositions; the first one 
of these is a generalization of Proposition 6.3.4. 


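Proposition 8.1.1 Given a C* algebra $\mathcal{A}$, a state $\omega$ on it and CPU maps $\gamma_j : M_j \to \mathcal{A}$,
$j = 1, 2, \dots, n$, from finite-dimensional C* algebras $M_j$ ($\dim M_j \le d$) into $\mathcal{A}$,
consider a decomposition $\omega = \sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ and $\varepsilon > 0$. Then, there exists a
decomposition $\omega = \sum_{j^{(n)}\in J^{(n)}}\lambda'_{j^{(n)}}\,\omega'_{j^{(n)}}$, where $J^{(n)} := J_1\times J_2\times\cdots\times J_n$ is a
multi-index set whose cardinality $\mathrm{card}(J^{(n)})$ depends on $d$ and $\varepsilon$ and $j^{(n)} = j_1 j_2\cdots j_n$, $j_k \in J_k$, such
that

$$H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) \le H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda'_{j^{(n)}}, \omega'_{j^{(n)}}\}_{j^{(n)}\in J^{(n)}}\big) + n\,\varepsilon.$$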
Proof Following the steps in the proof of Proposition 6.3.4, consider partitions
$Z_j = \{Z^j_\ell\}_{\ell=1}^{n_j}$ of the state-spaces $S(M_j)$ into atoms $Z^j_\ell$, such that if $\nu_{1,2}$ are two
states on $M_j$ belonging to any $Z^j_\ell$, then $\|\nu_1 - \nu_2\| \le \delta$. The cardinality $n_j$ of each
of these partitions can be chosen not larger than the smallest number, $r$, of balls of
radius $\le \delta$ needed to cover each $S(M_j)$, a number which depends on $\dim M_j$ and
thus on $d$. Given the decompositions


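$$\omega = \sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}} \quad\hbox{and}\quad \omega = \sum_{i_k\in I_k}\lambda^k_{i_k}\,\omega^k_{i_k}, \quad k = 1, 2, \dots, n,$$

one constructs the states

$$\omega'^k_{j_k} := \frac{1}{\lambda'^k_{j_k}}\sum_{\substack{i_k\in I_k\\ \omega^k_{i_k}\circ\gamma_k\,\in\, Z^k_{j_k}}}\lambda^k_{i_k}\,\omega^k_{i_k}, \qquad \lambda'^k_{j_k} := \sum_{\substack{i_k\in I_k\\ \omega^k_{i_k}\circ\gamma_k\,\in\, Z^k_{j_k}}}\lambda^k_{i_k},$$

$$\omega'_{j^{(n)}} := \frac{1}{\lambda'_{j^{(n)}}}\sum_{\substack{i^{(n)}\in I^{(n)}\\ \omega^k_{i_k}\circ\gamma_k\,\in\, Z^k_{j_k},\ k=1,2,\dots,n}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}, \qquad \lambda'_{j^{(n)}} := \sum_{\substack{i^{(n)}\in I^{(n)}\\ \omega^k_{i_k}\circ\gamma_k\,\in\, Z^k_{j_k},\ k=1,2,\dots,n}}\lambda_{i^{(n)}},$$

and the corresponding decompositions

$$\omega = \sum_{j^{(n)}\in J^{(n)}}\lambda'_{j^{(n)}}\,\omega'_{j^{(n)}} \quad\hbox{and}\quad \omega = \sum_{j_k\in J_k}\lambda'^k_{j_k}\,\omega'^k_{j_k}.$$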
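Then, introducing the probability distributions $\Lambda^{(n)} := \{\lambda_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$ and
$\Lambda'^{(n)} := \{\lambda'_{j^{(n)}}\}_{j^{(n)}\in J^{(n)}}$, together with the respective marginal distributions
$\Lambda_k := \{\lambda^k_{i_k}\}_{i_k\in I_k}$ and $\Lambda'_k := \{\lambda'^k_{j_k}\}_{j_k\in J_k}$, one estimates

$$\begin{aligned}
&H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) - H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda'_{j^{(n)}}, \omega'_{j^{(n)}}\}_{j^{(n)}\in J^{(n)}}\big) \\
&\qquad = H(\Lambda^{(n)}) - H(\Lambda'^{(n)}) - \sum_{k=1}^{n}\big(H(\Lambda_k) - H(\Lambda'_k)\big) \\
&\qquad\quad + \sum_{k=1}^{n}\Big(\sum_{i_k\in I_k}\lambda^k_{i_k}\, S\big(\omega^k_{i_k}\circ\gamma_k\,,\,\omega\circ\gamma_k\big) - \sum_{j_k\in J_k}\lambda'^k_{j_k}\, S\big(\omega'^k_{j_k}\circ\gamma_k\,,\,\omega\circ\gamma_k\big)\Big) \\
&\qquad \le n\,\varepsilon.
\end{aligned}$$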


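Indeed, according to the proof of Proposition 6.3.4, each term in the second sum
over $k$ is $\le \varepsilon$, while the first line after the equality is $\le 0$. This can be seen by
considering the $\Lambda_k$ as probability distributions over partitions $\mathcal{P}_k$ with atoms $P^k_{i_k}$
and $\Lambda^{(n)}$ as a probability distribution over the finite partition $\mathcal{P}^{(n)} := \bigvee_{k=1}^{n}\mathcal{P}_k$.
Then (compare Remarks 2.4.2), the $\Lambda'_k$ and $\Lambda'^{(n)}$ are probability distributions over
coarser partitions $\mathcal{Q}_k \preceq \mathcal{P}_k$, respectively $\mathcal{Q}^{(n)} := \bigvee_{k=1}^{n}\mathcal{Q}_k \preceq \mathcal{P}^{(n)}$, whence, using
the conditional entropy (2.90),

$$\begin{aligned}
H(\mathcal{P}^{(n)}\vee\mathcal{Q}^{(n)}) &= H(\mathcal{P}^{(n)}) = H(\mathcal{Q}^{(n)}) + H(\mathcal{P}^{(n)}|\mathcal{Q}^{(n)}), \\
H(\mathcal{P}_k\vee\mathcal{Q}_k) &= H(\mathcal{P}_k) = H(\mathcal{Q}_k) + H(\mathcal{P}_k|\mathcal{Q}_k).
\end{aligned}$$

Thus, from Lemma 2.4.3 and Corollary 2.4.2,

$$H(\mathcal{P}^{(n)}) - H(\mathcal{Q}^{(n)}) \le \sum_{k=1}^{n} H(\mathcal{P}_k|\mathcal{Q}_k) = \sum_{k=1}^{n}\big(H(\mathcal{P}_k) - H(\mathcal{Q}_k)\big).$$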

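From Proposition 8.1.1 it follows that for any $\varepsilon > 0$, there exists a finite
decomposition $\omega = \sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ such that

$$H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) \ge H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) - n\,\varepsilon, \tag{8.8}$$

while, from Definition 8.1.1,

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) \ge H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big).$$

We shall call these decompositions $\varepsilon$-optimal and remark that their cardinality
$r := \#(I^{(n)})$ depends on $\varepsilon$ and on the maximal dimension $d$ of the finite-dimensional C*
algebras on which the $\gamma_j$ act. Then, the n-CPU map entropies turn out to be equicontinuous
in the maps $\gamma_j : M_j \to \mathcal{A}$ with respect to the topology defined on their linear space
by the norm

$$\|\gamma - \gamma'\|_\omega := \sup_{X\in M,\ \|X\|\le 1}\big\|(\gamma - \gamma')(X)\big\|_\omega, \tag{8.9}$$

where $\|X\|_\omega^2 := \omega(X^\dagger X)$ for all $X \in \mathcal{A}$.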


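Proposition 8.1.2 Let $\gamma_j$ and $\gamma'_j$, $j = 1, 2, \dots, n$, be CPU maps from
finite-dimensional C* algebras $M_j$ with $\dim M_j \le d$ into $\mathcal{A}$. Then, for any $\varepsilon > 0$ there
can be found $\delta > 0$ depending on $\varepsilon$ and $d$ such that

$$\big|H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) - H_\omega(\gamma'_1, \gamma'_2, \dots, \gamma'_n)\big| \le n\,\varepsilon$$

when $\|\gamma_j - \gamma'_j\|_\omega \le \delta$.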
Proof Consider a finite decomposition of cardinality $r$ as in (8.8) with $\varepsilon/2$ in the
place of $\varepsilon$; then,


fyi 
Hy (91; V2, ++ ++ Yn) — Hu (Yi Jasi: A) <H,” oe W. io a 
{7} 1 (n 
-1l (pera ogy) +E Elstol) 
ne 
+ Yu, (8(4, 0%) - (4, 0%))) + F 
ijelj 
Choose 0 < 6 < do and ĉo such that the Fannes inequality 5.167 implies 
E 
|s om) — Sam] < = 


when $M$ is a finite-dimensional C* algebra with $\dim M \le d$ and $\nu_{1,2} \in S(M)$ are
states on it such that $\|\nu_1 - \nu_2\| \le \delta_0$. Since $|\omega(X)| \le \|X\|_\omega$ (see (5.51)), it follows
that


Iwoj-YPll= sup lwo (yj =M) < lylo < 9 < 60, 
MeM,||M\|<1 


whence DG (wo 7;) — $(wo7;)) < ne. 


In order to estimate the remaining sums, let us divide each index set $I_j$ into two
disjoint subsets, namely


and its complement. Since w = a el; ri, wi, , from (8.9), the state-based distances 


are such that, for alli; € J 


< 00, 


1 
Iy = Yl < Sli — Fille < 
ij A! 4 
V tj 


n 
Í È : 
so that $` > ri, (s (w o yj) -5S (v o %)) <n g Fivally, using that card (7;) < 
j=l ijel? 
card(I™) < r and dim M < d together with (5.165), one derives 


2 


E A(S (e en) -s (4 04) sige 
ijgI? 
ô
Therefore, ô < do such that raz logd < 7 yields 
0 


a E 
Ho (V1, Y2. +--+» Yn) 4 Ho i rea L ne + A ne, 


whence the result follows by exchanging the sets {y jar and o; Ja T 


Other properties that are important for applications to concrete quantum dynam- 
ical systems are the following ones. 


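Proposition 8.1.3 (Properties of n-CPU Entropies) Given a C* algebra $\mathcal{A}$ equipped
with a state $\omega$ and $n$ CPU maps $\gamma_j : M_j \to \mathcal{A}$ from unital C* algebras $M_j$,
$j = 1, 2, \dots, n$, with $\dim M_j \le d$, into $\mathcal{A}$, it holds that:

1. the n-CPU map entropies are positive and bounded,

$$0 \le H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) \le \sum_{j=1}^{n} H_\omega(\gamma_j) \le \sum_{j=1}^{n} S(\omega\circ\gamma_j) \le n\log d; \tag{8.10}$$

2. the n-CPU map entropies do not depend on the order of their arguments:

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) = H_\omega(\gamma_{\pi(1)}, \gamma_{\pi(2)}, \dots, \gamma_{\pi(n)}), \tag{8.11}$$

with $\pi(i)$ any permutation of $1, 2, \dots, n$;

3. the n-CPU map entropies are not sensitive to repetitions of their arguments:

$$H_\omega(\gamma_1, \dots, \gamma_{j-1}, \gamma_j, \gamma_j, \gamma_{j+1}, \dots, \gamma_n) = H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n); \tag{8.12}$$

4. if $\Theta : \mathcal{A} \to \mathcal{A}$ is an automorphism such that $\omega\circ\Theta = \omega$, then

$$H_\omega(\Theta\circ\gamma_1, \Theta\circ\gamma_2, \dots, \Theta\circ\gamma_n) = H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n); \tag{8.13}$$

5. the n-CPU map entropies are subadditive:

$$H_\omega(\gamma_1, \dots, \gamma_p, \gamma_{p+1}, \dots, \gamma_n) \le H_\omega(\gamma_1, \gamma_2, \dots, \gamma_p) + H_\omega(\gamma_{p+1}, \gamma_{p+2}, \dots, \gamma_n); \tag{8.14}$$

6. the n-CPU map entropies are monotonic under composition of CPU maps; namely,
if $\psi_j : N_j \to M_j$ are CPU maps from finite-dimensional C* algebras $N_j$ into
finite-dimensional C*-algebras $M_j$, $j = 1, 2, \dots, n$, which are in turn mapped
into $\mathcal{A}$ by CPU maps $\gamma_j$, then

$$H_\omega(\gamma_1\circ\psi_1, \gamma_2\circ\psi_2, \dots, \gamma_n\circ\psi_n) \le H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n); \tag{8.15}$$

7. the n-CPU map entropies increase by non-trivially increasing the number of their
arguments:

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) \le H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n, \gamma_{n+1}). \tag{8.16}$$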


Before proving the previous properties we examine some simple cases where, in 
analogy with Example 6.3.4, the n-CPU map entropies can explicitly be computed. 


Examples 8.1.1 


1. If in (8.15) one considers as CPU maps $\gamma_j$ and $\psi_j$ the natural embeddings of
subalgebras $N_j \subseteq M_j \subseteq \mathcal{A}$, then that property asserts that the n-subalgebra
entropies increase under embeddings into larger subalgebras. Suppose the
finite-dimensional C* subalgebras $\{M_j\}_{j=1}^{n}$ to be such that they together generate a
finite-dimensional subalgebra $M^{(n)} := \bigvee_{j=1}^{n} M_j$. This is the case, for instance,
when each $M_j$ is a spin-algebra at site $j$ on a lattice so that $M^{(n)}$ is the algebra of
$n$ spins at sites $1, 2, \dots, n$. Then, $M_j \subseteq M^{(n)}$ whence property (8.15) together
with property (8.12) and property (8.10) give

$$H_\omega(M_1, M_2, \dots, M_n) \le H_\omega(M^{(n)}, M^{(n)}, \dots, M^{(n)}) = H_\omega(M^{(n)}) \le S(\omega\restriction M^{(n)}). \tag{8.17}$$


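2. Suppose $\mathcal{A}$ is an Abelian von Neumann algebra and $\{A_j\}_{j=1}^{n}$ are finite dimensional
subalgebras generated by minimal projectors $\{\hat{a}_{j i_j}\}_{i_j=1}^{d_j}$. Then, consider the
products $\hat{a}_{i^{(n)}} := \hat{a}_{1 i_1}\hat{a}_{2 i_2}\cdots\hat{a}_{n i_n}$, $I^{(n)} = \times_{j=1}^{n} I_j$, where $i^{(n)} = i_1 i_2\cdots i_n$, $i_j \in I_j$ and
$I_j = \{1, 2, \dots, d_j\}$. Because of commutativity, these are projectors that one can
use to decompose $\omega$ as follows

$$\omega = \sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}, \qquad \omega_{i^{(n)}}(a) := \frac{\omega(\hat{a}_{i^{(n)}}\, a)}{\omega(\hat{a}_{i^{(n)}})} \quad \forall\, a \in \mathcal{A}.$$

Further, the various probability distributions and elements of the
subdecompositions amount to

$$\Lambda^{(n)} = \big\{\omega(\hat{a}_{i^{(n)}})\big\}_{i^{(n)}\in I^{(n)}}, \qquad \omega^j_{i_j}(a) = \frac{\omega(\hat{a}_{j i_j}\, a)}{\omega(\hat{a}_{j i_j})}, \qquad \Lambda_j = \big\{\omega(\hat{a}_{j i_j})\big\}_{i_j\in I_j}.$$

It thus follows that the states $\omega\restriction A_j$ coincide with the probability distributions $\Lambda_j$,
while the states $\omega^j_{i_j}\restriction A_j$ are pure states
because the orthogonality of the minimal projectors implies

$$\omega^j_{i_j}(\hat{a}_{j k}) = \delta_{k\, i_j}.$$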


Since the minimal projections $\hat{a}_{i^{(n)}}$ generate the Abelian subalgebra
$A^{(n)} := \bigvee_{j=1}^{n} A_j$, this yields

$$H_\omega^{\{A_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}\big) = -\sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\log\lambda_{i^{(n)}} = S(\omega\restriction A^{(n)}).$$

Because of (8.17) this result is optimal; thus,

$$H_\omega(A_1, A_2, \dots, A_n) = S(\omega\restriction A^{(n)}). \tag{8.18}$$

Notice that the latter von Neumann entropy is the Shannon entropy of the random
variable $\bigvee_{j=1}^{n} A_j$ distributed with probability $\Lambda^{(n)}$, where the random variables
$A_j$ are distributed with marginal probabilities $\Lambda_j$. The outcomes of these random
variables correspond to the minimal projections $\hat{a}_{j i_j}$; according to Sect. 5.3.2, via
the Gelfand transform these can be turned into characteristic functions of the
atoms of suitable measurable partitions.

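3. The result in (8.18) also holds when the $A_j$ are commuting Abelian
finite-dimensional subalgebras of a non-commutative $\mathcal{A}$, but $\omega$ is the tracial state. Indeed,
the minimal projectors of the $A_j$ provide an optimal decomposition as in the
previous example. The reason is that the modular automorphism of the tracial state
is trivial; thus, (8.6) yields

$$\lambda_{i^{(n)}}\,\omega_{i^{(n)}}(a) = \langle\Omega_\omega|\,\sigma_\omega^{i/2}\big(\pi_\omega(\hat{a}_{i^{(n)}})\big)\,\pi_\omega(a)\,|\Omega_\omega\rangle = \langle\Omega_\omega|\,\pi_\omega(\hat{a}_{i^{(n)}}\, a)\,|\Omega_\omega\rangle = \omega(\hat{a}_{i^{(n)}}\, a).$$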


4. Suppose $\{M_j\}_{j=1}^{n}$ are finite-dimensional C* subalgebras that generate a
finite-dimensional subalgebra $M^{(n)} := \bigvee_{j=1}^{n} M_j \subseteq \mathcal{A}$. Further, suppose they contain
pairwise commuting Abelian subalgebras $A_j \subseteq M_j$ each belonging to the
centralizer of $\omega$² and such that the algebra $A^{(n)} = \bigvee_{j=1}^{n} A_j$ they generate is
maximally Abelian in $M^{(n)}$. Since the $A_j$ pairwise commute, the products of their
minimal projectors $\hat{a}_{j i_j}$ provide the minimal projectors $\hat{a}_{i^{(n)}}$ of $A^{(n)}$. By assumption,
they are left invariant by the modular automorphism of $\omega$ and can thus be
used to decompose $\omega$ as in the previous two examples. Then, from the second
example above and from Example 6.3.1.2 one derives

$$H_\omega(M_1, M_2, \dots, M_n) \ge H_\omega^{\{M_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) = S(\omega\restriction A^{(n)}) = S(\omega\restriction M^{(n)}).$$

Therefore, the first example yields

$$H_\omega(M_1, M_2, \dots, M_n) = S(\omega\restriction M_1\vee M_2\vee\cdots\vee M_n). \tag{8.19}$$


² They are therefore left pointwise invariant by the modular automorphism of $\omega$.
1. Positivity comes from choosing not to decompose $\omega$ at all, in which case the
argument of the supremum in (8.3) vanishes. Further, the first line in the argument
of the supremum equals minus the relative entropy (see (2.94)) $S(\Lambda^{(n)}, \widetilde{\Lambda}^{(n)})$
of the two probability distributions $\Lambda^{(n)} = \{\lambda_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$, respectively
$\widetilde{\Lambda}^{(n)} := \big\{\prod_{j=1}^{n}\lambda^j_{i_j}\big\}_{i^{(n)}\in I^{(n)}}$, on the set of strings $i^{(n)}$. Since the relative entropy
is non-negative, the upper bound to the n-CPU map entropies follows from
Lemma 6.3.1.

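2. In order to show (8.11), let $\omega = \sum_{i^{(n)}\in I^{(n)}}\lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ be an $\varepsilon$-optimal
decomposition such that, as in (8.8),

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) \le H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) + n\,\varepsilon.$$

Since $H_\omega^{\{\gamma_j\}_{j=1}^{n}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big)$ equals
$H_\omega^{\{\gamma_{\pi(j)}\}_{j=1}^{n}}\big(\{\lambda_{\pi(i^{(n)})}, \omega_{\pi(i^{(n)})}\}_{i^{(n)}\in I^{(n)}}\big)$, where $\pi(i^{(n)})$ denotes the string
$i_{\pi(1)} i_{\pi(2)}\cdots i_{\pi(n)}$, it follows that

$$H_\omega(\gamma_1, \gamma_2, \dots, \gamma_n) \le H_\omega^{\{\gamma_{\pi(j)}\}_{j=1}^{n}}\big(\{\lambda_{\pi(i^{(n)})}, \omega_{\pi(i^{(n)})}\}_{i^{(n)}\in I^{(n)}}\big) + n\,\varepsilon \le H_\omega(\gamma_{\pi(1)}, \gamma_{\pi(2)}, \dots, \gamma_{\pi(n)}) + n\,\varepsilon.$$

Equality follows from the arbitrariness of $\varepsilon > 0$ by exchange of the sets $\{\gamma_j\}_{j=1}^{n}$
and $\{\gamma_{\pi(j)}\}_{j=1}^{n}$.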

3. In view of (8.11), to prove (8.12) we show that $H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n)$ does not
change if the argument $\gamma_n$ appears twice. Consider an $\varepsilon$-optimal decomposition
for the right hand side of (8.12) and, according to (8.5), the corresponding positive
decomposition of unity in the commutant $\pi_\omega(\mathcal{A})'$, $\{X'_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$. Then, construct
a new decomposition


e Jat+h 
w = 2 X arh jorn 


jetDeJ”+D 


based on a decomposition of unit consisting of 7(A)’ > Y, a+) += Y;a I, where 
JOY := iM j, 41> i™ € I™ and Inti € Jng1 = {1}. Then, 


1 Poi 
Minh = =w(Fjo+) =n, Gy sw Vkén+1, 
while $\lambda^{n+1}_{1} = 1$ and $\omega^{n+1}_{1} = \omega$. Therefore,


{ 
Giga EE = oe r )+ne 
(n) iMe](n) 
PPO LX Fjes] 
= = Hy A ARA Y; s(n+1) jerrDe gash + ne 


< Hy (M1, V2, +--+. Yn Yn) + ne. 


In order to invert this inequality, let 


ù (n+ 
Ww = à` ae Wi -(n+1) 


i@tDeznth 


be an $\varepsilon$-optimal decomposition for the left hand side of (8.12) and consider


PMerT@™ 
where j® = jij2--- jn With jy = ig for 1 < k < n — 1, while jy € Jn := In x 
In+1 enumerates the pairs (inin+1) so that IM =h x- In X In, i = 


ee D s(n) = win) If 1 <k <n —1, it also turns out that XE = Ak and mA = 
i : 


J 
wk , while 
k 
ea YO Ge 1 ae 
Nin = À path) 2 i = yn À ne (nt) + 
i(n+1) jn i(n+!) 
ae fixed in,in 41 fixed 


Then, A” := po 1 = A®+D, and Ay := fat | = A; for 1 < 
DJ Meza) tk J ined 
k < n — 1, whereas Ag = {3 |. Finally, one can estimate 
Mika (fxm ~ 
Ho (V1; V2, -s Yn=1; Yn) = Hw (xe oe Wj eae) 
n—1 ; 
= H(A") — Yra, E VN SG oy, wor) 
j=l ijel; 
—H(An) + > Xj, (Gorm, worm). Œ 


InEIn 


From (2.88) it follows that H(A) < H(An) + H(An+1). On the other hand, 
with jn = (in, in+1), 
n _ Yn ~n n+1 Yn ~n 


in Ne Jn wi, in+] \n+l Jn Jn 
In int) intl in 
and (6.43) imply
n+1 
E S(r om wom) > DMS (wh om, wor) - 
Jn€In k=n ipely 


Together with (*) this yields 


yin 19 (n+1) 
Ho (71, Y2, a ee | Yn— l, Yn) = > Hy "(as “(n+1)? Wi (n+1) a 


> Hy (1, Y2.---> Yn Yn) — ne. 


4. Property (8.13) is a consequence of $\omega\circ\Theta = \omega$. In fact, given an $\varepsilon$-optimal
decomposition for the right (left) hand side of (8.13) and the corresponding positive
decomposition of unity in the commutant, $\{X'_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$, then
$\{U_\omega X'_{i^{(n)}} U_\omega^\dagger\}_{i^{(n)}\in I^{(n)}}$ ($\{U_\omega^\dagger X'_{i^{(n)}} U_\omega\}_{i^{(n)}\in I^{(n)}}$), where $U_\omega$ is the unitary GNS
implementation of $\Theta$, provides a decomposition for the left (right) side.
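5. In order to prove subadditivity, let $\omega = \sum_{i^{(n)}\in I^{(n)}} \lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ be an $\varepsilon$-optimal
decomposition for $H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n)$ and construct from it the following two
decompositions:

$$
\omega = \sum_{j^{(p)}} \lambda^1_{j^{(p)}}\,\omega^1_{j^{(p)}} , \qquad
\omega = \sum_{k^{(n-p)}} \lambda^2_{k^{(n-p)}}\,\omega^2_{k^{(n-p)}} ,
$$

where $j^{(p)} := i_1 i_2\cdots i_p \in I^{(p)} := I_1\times I_2\times\cdots\times I_p$, while the indexes
$k^{(n-p)} := i_{p+1} i_{p+2}\cdots i_n \in I_{p+1}\times I_{p+2}\times\cdots\times I_n$, and

$$
\lambda^1_{j^{(p)}} := \sum_{k^{(n-p)}} \lambda_{i^{(n)}} , \qquad
\omega^1_{j^{(p)}} := \sum_{k^{(n-p)}} \frac{\lambda_{i^{(n)}}}{\lambda^1_{j^{(p)}}}\,\omega_{i^{(n)}} ,
$$
$$
\lambda^2_{k^{(n-p)}} := \sum_{j^{(p)}} \lambda_{i^{(n)}} , \qquad
\omega^2_{k^{(n-p)}} := \sum_{j^{(p)}} \frac{\lambda_{i^{(n)}}}{\lambda^2_{k^{(n-p)}}}\,\omega_{i^{(n)}} .
$$

Since $\Lambda^1 := \{\lambda^1_{j^{(p)}}\}$ and $\Lambda^2 := \{\lambda^2_{k^{(n-p)}}\}$ are marginal distributions of
$\Lambda^{(n)} = \{\lambda_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}$, (2.88) yields $H(\Lambda^{(n)}) \le H(\Lambda^1) + H(\Lambda^2)$, whence

$$
H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_p) + H_\omega(\gamma_{p+1}, \gamma_{p+2}, \ldots, \gamma_n)
\ge H_\omega^{\{\gamma_j\}_{j=1}^p}\big(\{\lambda^1_{j^{(p)}}, \omega^1_{j^{(p)}}\}\big)
+ H_\omega^{\{\gamma_j\}_{j=p+1}^n}\big(\{\lambda^2_{k^{(n-p)}}, \omega^2_{k^{(n-p)}}\}\big)
$$
$$
\ge H_\omega^{\{\gamma_j\}_{j=1}^n}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big)
\ge H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n) - n\varepsilon .
$$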


6. Property (8.15) follows from the monotonicity of the relative entropy under CPU 
maps. 

7. Finally, property (8.16) is a consequence of (8.15) and of the fact that, given an
$\varepsilon$-optimal decomposition for the left hand side, one may construct a decomposition
for the right hand side as in the first part of point 3 above.


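Proposition 8.1.4 ([258]) Given a C* algebra $\mathcal{A}$ equipped with a state $\omega$, let
$\{\gamma_j\}_{j=1}^n$ and $\gamma$ be CPU maps from finite-dimensional C*-algebras $\{M_j\}_{j=1}^n$ and $M$
into $\mathcal{A}$; then,

$$
H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_{n-1}, \gamma_n) - H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_{n-1}, \gamma) \le H_\omega(\gamma_n \,|\, \gamma) , \tag{8.20}
$$

where

$$
H_\omega(\gamma_1 \,|\, \gamma_2) := \sup_{\omega=\sum_{i\in I}\lambda_i\omega_i} H_\omega^{\gamma_1,\gamma_2}\big(\{\lambda_i, \omega_i\}_{i\in I}\big) , \tag{8.21}
$$
$$
H_\omega^{\gamma_1,\gamma_2}\big(\{\lambda_i, \omega_i\}_{i\in I}\big) := \sum_{i\in I} \lambda_i \Big( S(\omega_i\circ\gamma_1, \omega\circ\gamma_1) - S(\omega_i\circ\gamma_2, \omega\circ\gamma_2) \Big) . \tag{8.22}
$$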


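Proof Let $\omega = \sum_{i^{(n)}\in I^{(n)}} \lambda_{i^{(n)}}\,\omega_{i^{(n)}}$ be an $\varepsilon$-optimal decomposition such that, as
in (8.8),

$$
H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n) \le H_\omega^{\{\gamma_j\}_{j=1}^n}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}_{i^{(n)}\in I^{(n)}}\big) + n\varepsilon ;
$$

then, according to (8.7),

$$
H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_{n-1}, \gamma_n) - H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_{n-1}, \gamma)
$$
$$
\le H_\omega^{\{\gamma_j\}_{j=1}^n}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}\big)
- H_\omega^{\{\gamma_1,\ldots,\gamma_{n-1},\gamma\}}\big(\{\lambda_{i^{(n)}}, \omega_{i^{(n)}}\}\big) + n\varepsilon
$$
$$
= \sum_{i_n\in I_n} \lambda^n_{i_n} \Big( S(\omega^n_{i_n}\circ\gamma_n, \omega\circ\gamma_n) - S(\omega^n_{i_n}\circ\gamma, \omega\circ\gamma) \Big) + n\varepsilon
\le H_\omega(\gamma_n \,|\, \gamma) + n\varepsilon .
$$

Since $n$ is fixed and $\varepsilon$ is arbitrary the result follows.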




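Example 8.1.2 If $\mathcal{A}$ is an Abelian C* algebra and $\gamma_{1,2}$ are the natural embeddings of
two finite-dimensional Abelian C* algebras $A_{1,2} \subseteq \mathcal{A}$, then $H_\omega(A_1 \,|\, A_2)$ reduces
to the conditional entropy of the random variables $A_{1,2}$ associated with the minimal
projections $\{a_{1i}\}_{i\in I_1}$, $I_1 = \{1, 2, \ldots, d_1\}$, and $\{a_{2j}\}_{j\in I_2}$, $I_2 = \{1, 2, \ldots, d_2\}$, of $A_{1,2}$. These
projections give rise to probability distributions $\omega{\upharpoonright}A_1 = \{\omega(a_{1i})\}_{i=1}^{d_1}$ and
$\omega{\upharpoonright}A_2 = \{\omega(a_{2j})\}_{j=1}^{d_2}$ and can be considered as the outcomes of $A_{1,2}$. Accordingly, the set
of expectations $\omega(a_{1i}a_{2j})$ corresponds to the probability distribution of the joint
random variable $A_1 \vee A_2$. Consequently, by using the minimal projections of $A_1$ to
decompose

$$
\omega = \sum_{i\in I_1} \omega(a_{1i})\,\omega_i , \qquad \omega_i(a) := \frac{\omega(a_{1i}\,a)}{\omega(a_{1i})} \quad \forall a\in\mathcal{A} ,
$$

it turns out that $\omega_i{\upharpoonright}A_1 = \{\omega_i(a_{1k}) = \delta_{ik}\}_{k=1}^{d_1}$ and
$\omega_i{\upharpoonright}A_2 = \big\{\omega(a_{1i}a_{2j})/\omega(a_{1i})\big\}_{j=1}^{d_2}$. Then,
by means of (8.22) and of (2.91) one gets

$$
H_\omega^{A_1,A_2}\big(\{\omega(a_{1i}), \omega_i\}_{i\in I_1}\big) = H(A_1) - H(A_2) - H(A_1) + H(A_1\vee A_2)
= H(A_1\vee A_2) - H(A_2) = H(A_1|A_2) ,
$$

where $H(A_{1,2}) := S(\omega{\upharpoonright}A_{1,2})$ are the Shannon entropies of the random variables
$A_{1,2}$. We now show that no decomposition can do better; indeed, in the Abelian case
at hand, decompositions of $\omega$ correspond to partitions of unity with positive elements
$\{c_k\}_{k\in K}$ in $\mathcal{A}$ such that

$$
\omega = \sum_{k\in K} \omega(c_k)\,\omega_k , \qquad \omega_k(a) = \frac{\omega(c_k\,a)}{\omega(c_k)} \quad \forall a\in\mathcal{A} .
$$

Finally, Corollary 2.4.1 and strong subadditivity (see Proposition 2.4.1) yield

$$
H_\omega^{A_1,A_2}\big(\{\omega(c_k), \omega_k\}_{k\in K}\big) = H(A_1) - H(A_1\vee C) - H(A_2) + H(A_2\vee C)
$$
$$
\le H(A_1) - H(A_1\vee C) - H(A_2) + H(A_1\vee A_2\vee C)
\le H(A_1\vee A_2) - H(A_2) = H(A_1|A_2) ,
$$

where $C$, $A_1\vee C$ and $A_2\vee C$ are random variables with probability distributions
$\{\omega(c_k)\}_{k\in K}$, $\{\omega(c_k a_{1i})\}_{i\in I_1, k\in K}$ and $\{\omega(c_k a_{2j})\}_{j\in I_2, k\in K}$.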

8.1.1 CNT Entropy Rate and CNT Entropy 


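Apart from the relation (8.13), all other properties in Proposition 8.1.3 regard
n-tuples of arbitrary maps $\gamma_j$ without reference to the dynamics. Since the purpose
of the CNT entropy is to quantify the information production in a given quantum
dynamical triplet $(\mathcal{A}, \Theta, \omega)$, we set $\gamma_j := \Theta^j\circ\gamma$, where $j = 0, 1, \ldots, n-1$ and
$\gamma : M \to \mathcal{A}$ is a CPU map from a finite-dimensional C* algebra into $\mathcal{A}$. The first
step is to ensure the existence of the rate

$$
\lim_{n\to\infty} \frac{1}{n}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma) .
$$

This limit exists since (8.14) together with (8.13) yield, for $1 \le p < n$,

$$
H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma) \le H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{p-1}\circ\gamma)
+ H_\omega(\Theta^{p}\circ\gamma, \Theta^{p+1}\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma)
$$
$$
= H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{p-1}\circ\gamma) + H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-p-1}\circ\gamma) .
$$

Thus, one can apply the same argument already used to show the existence of the
classical entropy rate (3.2) or of the mean von Neumann entropy in quantum spin
chains: actually,

$$
\lim_{n\to\infty} \frac{1}{n}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma)
= \inf_{n} \frac{1}{n}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma) . \tag{8.23}
$$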
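The step from subadditivity to (8.23) is the standard subadditivity argument: if $a_{m+n} \le a_m + a_n$, then $a_n/n$ converges to $\inf_n a_n/n$. The sketch below only illustrates this mechanism; the sequence `a` is a hypothetical stand-in for $n \mapsto H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma)$ and is not derived from any specific dynamical system.

```python
import math

def a(n, rate=math.log(2), c=1.5):
    """A hypothetical subadditive sequence: a(m + n) <= a(m) + a(n)."""
    return rate * n + c * math.sqrt(n)

def is_subadditive(seq, n_max):
    """Check seq(m + n) <= seq(m) + seq(n) for all 1 <= m, n < n_max."""
    return all(seq(m + n) <= seq(m) + seq(n) + 1e-12
               for m in range(1, n_max) for n in range(1, n_max))

# The ratios a(n)/n decrease towards their infimum, here log 2.
ratios = [a(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
for r in ratios:
    print(r)
```

For this particular choice $a_n/n = \log 2 + 1.5/\sqrt{n}$, so the limit and the infimum both equal $\log 2$, although the infimum is not attained at any finite $n$.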


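Definition 8.1.2 Given a quantum dynamical triplet $(\mathcal{A}, \Theta, \omega)$, where $\mathcal{A}$ is a C* or
a von Neumann algebra, and a CPU map $\gamma : M \to \mathcal{A}$ from a finite-dimensional C*
algebra $M$ into $\mathcal{A}$, the CNT entropy rate of $\gamma$ is

$$
h^{\mathrm{CNT}}_\omega(\Theta, \gamma) := \lim_{n\to\infty} \frac{1}{n}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma) , \tag{8.24}
$$

while the CNT entropy of $(\mathcal{A}, \Theta, \omega)$ is defined by

$$
h^{\mathrm{CNT}}_\omega(\Theta) := \sup_{\gamma}\, h^{\mathrm{CNT}}_\omega(\Theta, \gamma) . \tag{8.25}
$$

There are a few properties of the CNT entropy that easily follow from the above
construction.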


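Proposition 8.1.5 The following bounds hold for the CNT entropy rate of a quantum
dynamical triplet $(\mathcal{A}, \Theta, \omega)$; given any CPU map $\gamma$ from a finite dimensional C*
algebra $M$ into $\mathcal{A}$, one has:

$$
0 \le h^{\mathrm{CNT}}_\omega(\Theta, \gamma) \le H_\omega(\gamma) , \tag{8.26}
$$
$$
\frac{1}{n}\, h^{\mathrm{CNT}}_\omega(\Theta^n, \gamma) \le h^{\mathrm{CNT}}_\omega(\Theta, \gamma) \le h^{\mathrm{CNT}}_\omega(\Theta^n, \gamma) . \tag{8.27}
$$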
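Proof The bounds in (8.26) come from the properties (8.10) and (8.13) of the n-CPU
map entropies. The upper bound in (8.27) follows from subadditivity (8.14) together
with property (8.13); indeed, since the limit (8.24) exists, one can fix $\mathbb{N} \ni n > 0$ and
compute

$$
h^{\mathrm{CNT}}_\omega(\Theta, \gamma) = \lim_{k\to\infty} \frac{1}{nk}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{nk-1}\circ\gamma)
$$
$$
\le \frac{1}{n} \lim_{k\to\infty} \frac{1}{k} \sum_{j=0}^{n-1} H_\omega(\Theta^{j}\circ\gamma, \Theta^{j+n}\circ\gamma, \ldots, \Theta^{(k-1)n+j}\circ\gamma)
$$
$$
= \lim_{k\to\infty} \frac{1}{k}\, H_\omega(\gamma, \Theta^{n}\circ\gamma, \ldots, \Theta^{n(k-1)}\circ\gamma) = h^{\mathrm{CNT}}_\omega(\Theta^n, \gamma) .
$$

For the lower bound, first notice that n-CPU entropies remain unchanged by adding
to the maps $\gamma_j$ in their arguments any number of CPU maps $\gamma'_j$ from trivial³ finite
dimensional C* algebras $\{c\,1\}$ into $\mathcal{A}$. Indeed, for any such CPU map $H_\omega(\gamma') = 0$,
thus by subadditivity,

$$
H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n, \gamma'_1, \ldots, \gamma'_p) \le H_\omega(\gamma_1, \gamma_2, \ldots, \gamma_n) .
$$

However, any optimal decomposition for the right hand side of the previous inequality
can always be used to decompose $\omega$ in the left hand side, too. This decomposition
can then be used to invert the previous inequality; then, one expands $\Theta^{jn}\circ\gamma$ into
the set

$$
\Gamma_j := \big\{ \Theta^{jn}\circ\gamma,\ \Theta^{jn+1}\circ\gamma',\ \ldots,\ \Theta^{jn+n-1}\circ\gamma' \big\} ,
$$

where $\gamma' = \gamma\circ\iota'$ and $\iota'$ embeds the trivial subalgebra of $M$ into $M$. Using (8.15),
one finally gets

$$
\frac{1}{k}\, H_\omega(\gamma, \Theta^{n}\circ\gamma, \ldots, \Theta^{n(k-1)}\circ\gamma) = \frac{1}{k}\, H_\omega(\Gamma_0, \Gamma_1, \ldots, \Gamma_{k-1})
\le n\, \frac{1}{kn}\, H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{kn-1}\circ\gamma) ,
$$

whence the result follows by taking the limit $k \to +\infty$.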


As for the KS entropy, one needs a means to avoid computing the
supremum in (8.25). The structure of AF or UHF C* algebras or hyperfinite von
Neumann algebras resembles that of classical dynamical systems admitting a
generating partition (see Definition 2.3.5). Indeed, by using the continuity properties of the
n-CPU entropies discussed in Proposition 8.1.2, one can prove a non-commutative
counterpart to the Corollary 3.1.1 of the Kolmogorov-Sinai Theorem 3.1.1.


³ That is, algebras consisting only of multiples of an identity operator $1$.
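Proposition 8.1.6 ([108]) Let $(\mathcal{A}, \Theta, \omega)$ be a C* quantum dynamical triple which
admits a sequence of CPU maps $\tau_j : M_j \to \mathcal{A}$ and $\sigma_j : \mathcal{A} \to M_j$ from finite-dimensional
C* algebras with identity into $\mathcal{A}$ and back such that
$\lim_{j\to+\infty} \|\tau_j\circ\sigma_j[A] - A\| = 0$ for all $A \in \mathcal{A}$. Then

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \lim_{j\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, \tau_j) .
$$

Proof Let $\gamma : M \to \mathcal{A}$ be any CPU map from a finite-dimensional C* algebra $M$
into $\mathcal{A}$; set $\gamma_j := \tau_j\circ\sigma_j\circ\gamma$. Then, $\gamma_j(m) \to \gamma(m)$ in norm for all $m \in M$, whence
$\|\gamma_j - \gamma\| \to 0$ when $j \to +\infty$ for $M$ is finite-dimensional. The same is true for the
CPU maps $\Theta^k\circ\gamma_j$ and $\Theta^k\circ\gamma$, $k \ge 0$. Since $\|\Theta^k\circ(\gamma_j - \gamma)\|_\omega \le \|\Theta^k\circ(\gamma_j - \gamma)\|$,
Proposition 8.1.2 yields

$$
\frac{1}{n}\, \Big| H_\omega(\gamma_j, \Theta\circ\gamma_j, \ldots, \Theta^{n-1}\circ\gamma_j) - H_\omega(\gamma, \Theta\circ\gamma, \ldots, \Theta^{n-1}\circ\gamma) \Big| \le \varepsilon
$$

for any $\varepsilon > 0$ and $j$ sufficiently large, whence

$$
\lim_{j\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \gamma_j) = h^{\mathrm{CNT}}_\omega(\Theta, \gamma) .
$$

Now, using the monotonicity property (8.15), it turns out that

$$
h^{\mathrm{CNT}}_\omega(\Theta, \gamma) = \liminf_{j\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \gamma_j) \le \liminf_{j\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \tau_j)
\le \limsup_{j\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \tau_j) \le h^{\mathrm{CNT}}_\omega(\Theta) .
$$

The result thus follows by taking the supremum over $\gamma$.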


In case $\mathcal{A}$ is an AF or a UHF C* algebra, namely the norm completion of an
increasing sequence of finite-dimensional C* subalgebras $M_j \subseteq \mathcal{A}$ or matrix
algebras $M_{n_j}(\mathbb{C})$ (see Remark 7.4.1), one chooses as CPU maps $\sigma_j$ the corresponding
conditional expectations and, as the CPU maps $\tau_j$, the natural embeddings
$\iota_{M_j} : M_j \to \mathcal{A}$ [108, 109, 274, 275].

A result similar to Proposition 8.1.6 holds for von Neumann quantum dynamical
systems with $\mathcal{A}$ a hyperfinite von Neumann algebra (for the proof see [108, 264]).


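Proposition 8.1.7 Let $(\mathcal{A}, \Theta, \omega)$ be a von Neumann dynamical triple, with $\mathcal{A}$
hyperfinite and generated by an increasing sequence of finite-dimensional von Neumann
subalgebras $\{M_k\}_{k\in\mathbb{N}}$; then,

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \lim_{k\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, M_k) .
$$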
Remark 8.1.2 When a quantum dynamical system has the algebraic structure
as in the above proposition, then the continuity properties of the CNT entropy allow
one to turn the lower bound in (8.27) into an equality [108], namely

$$
h^{\mathrm{CNT}}_\omega(\Theta^n) = |n|\, h^{\mathrm{CNT}}_\omega(\Theta) \quad \forall n \in \mathbb{Z} .
$$

Moreover, this result can be extended to a one-parameter group of automorphisms
$\{\Theta_t\}_{t\in\mathbb{R}}$, that is [256, 264]

$$
h^{\mathrm{CNT}}_\omega(\Theta_t) = |t|\, h^{\mathrm{CNT}}_\omega(\Theta) \quad \forall t \in \mathbb{R} ,
$$

where $\Theta := \Theta_{t=1}$.


8.1.2 CNT Entropy: Quasi-local Algebras 


As seen in Sect. 7.4, in quantum statistical mechanics one often considers quasi-local
algebras $\mathcal{A}$ which are generated (inductive limit) by local algebras $\mathcal{A}_V$, indexed by
finite volumes $V$, that are not finite dimensional. For instance, this is the case with
Bosons in $\mathbb{R}^3$ or with a lattice $\mathbb{Z}^3$ with infinite dimensional Hilbert spaces at its sites;
in such a setting one cannot resort to either of the preceding two propositions to
compute the CNT entropy (8.25).

Nevertheless, a quantum Kolmogorov-Sinai-like theorem holds under the following
physically plausible assumptions [275]; we shall consider dynamical triples
$(\mathcal{A}, \Theta, \omega)$ consisting of

1. a quasi-local C*-algebra $\mathcal{A}$ which is the norm completion of $\bigcup_V \mathcal{A}_V$ where the
local algebras $\mathcal{A}_V$ associated with finite volumes $V \subset \mathbb{R}^3$ share the same identity
and are isomorphic to the von Neumann algebras $B(\mathbb{H}_V)$⁴;

2. if $V \subset V'$, set $V'' := V' \setminus V$, then $\mathbb{H}_{V'} = \mathbb{H}_V \otimes \mathbb{H}_{V''}$, $\mathcal{A}_{V'} = \mathcal{A}_V \otimes \mathcal{A}_{V''}$;

3. the $\Theta$-invariant state $\omega$ is locally normal, that is $\omega{\upharpoonright}\mathcal{A}_V$ is a density matrix
$\rho_V \in B_1^+(\mathbb{H}_V)$.


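Let $\iota_V$ denote the embedding of $\mathcal{A}_V$ into $\mathcal{A}$; as shown in [274, 275], using the
second assumption one can construct a family of CPU conditional expectations
$\sigma_V : \mathcal{A} \to \mathcal{A}_V$ such that $\|\iota_V\circ\sigma_V[A] - A\| \to 0$ for all $A \in \mathcal{A}$ when $V \uparrow \mathbb{R}^3$. Consider
a CPU map $\gamma : M \to \mathcal{A}$ where $M$ is a finite-dimensional unital C* algebra and set
$\gamma_V := \iota_V\circ\sigma_V\circ\gamma$. Now, $\lim_{V\uparrow\mathbb{R}^3} \|\gamma_V - \gamma\| = 0$ for $M$ is finite dimensional whence
Proposition 8.1.6 yields

$$
\lim_{V\uparrow\mathbb{R}^3} h^{\mathrm{CNT}}_\omega(\Theta, \gamma_V) = h^{\mathrm{CNT}}_\omega(\Theta, \gamma) .
$$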


⁴ In the following we shall restrict to $\mathbb{R}^3$ for simplicity; the result holds in general for $\mathbb{R}^d$ and $\mathbb{Z}^d$,
$d \ge 1$ [275].
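From (8.25) and the previous equality one derives

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \sup_{\gamma} \lim_{V\uparrow\mathbb{R}^3} h^{\mathrm{CNT}}_\omega(\Theta, \gamma_V)
\le \limsup_{V\uparrow\mathbb{R}^3}\, \sup_{\gamma}\, h^{\mathrm{CNT}}_\omega(\Theta, \gamma_V)
$$
$$
\le \limsup_{V\uparrow\mathbb{R}^3}\, \sup_{\lambda_V : M \to \mathcal{A}_V} h^{\mathrm{CNT}}_\omega(\Theta, \iota_V\circ\lambda_V) \le h^{\mathrm{CNT}}_\omega(\Theta) ,
$$

where the second inequality holds because not all maps $\iota_V\circ\lambda_V$, with $\lambda_V : M \to \mathcal{A}_V$ a
CPU map, are of the form of $\gamma_V$. Thus,

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \lim_{V\uparrow\mathbb{R}^3}\, \sup_{\gamma_V : M \to \mathcal{A}_V} h^{\mathrm{CNT}}_\omega(\Theta, \iota_V\circ\gamma_V) . \tag{8.28}
$$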

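Fix a volume $V \subset \mathbb{R}^3$ with local density matrix $\rho_V = \sum_i r_V^i\, |r_V^i\rangle\langle r_V^i|$, where
the eigenvalues $r_V^i$ are repeated according to their multiplicities and decreasingly
ordered. Let $P_V^{(k)} := \sum_{i=1}^k |r_V^i\rangle\langle r_V^i|$, $Q_V^{(k)} := 1 - P_V^{(k)}$ and

$$
\mathcal{A}_V^{(k)} := P_V^{(k)}\, \mathcal{A}_V\, P_V^{(k)} \oplus \mathbb{C}\, Q_V^{(k)} .
$$

The latter is a finite von Neumann subalgebra of $\mathcal{A}_V$; consider the map

$$
\sigma_V^{(k)}[A] := P_V^{(k)} A\, P_V^{(k)} + \frac{\omega(Q_V^{(k)} A\, Q_V^{(k)})}{\omega(Q_V^{(k)})}\, Q_V^{(k)} \quad \forall A \in \mathcal{A}_V .
$$

It linearly maps $\mathcal{A}_V$ into $\mathcal{A}_V^{(k)}$, is unital and positive; further, if $A \in \mathcal{A}_V$ and
$\mathcal{A}_V^{(k)} \ni B = P_V^{(k)} Z\, P_V^{(k)} + c_B\, Q_V^{(k)}$, with $c_B \in \mathbb{C}$ and $Z \in \mathcal{A}_V$,

$$
\sigma_V^{(k)}[A\,B] = P_V^{(k)} A\, P_V^{(k)} Z\, P_V^{(k)} + c_B\, \frac{\omega(Q_V^{(k)} A\, Q_V^{(k)})}{\omega(Q_V^{(k)})}\, Q_V^{(k)} = \sigma_V^{(k)}[A]\, B .
$$

Therefore, according to Definition 5.2.5, $\sigma_V^{(k)} : \mathcal{A}_V \to \mathcal{A}_V^{(k)}$ is a conditional
expectation; then, for any CPU map $\gamma_V : M \to \mathcal{A}_V$ set $\tau_V := \iota_V\circ\gamma_V$ and
$\tau_V^{(k)} := \iota_V\circ\iota_V^{(k)}\circ\sigma_V^{(k)}\circ\gamma_V$, where $\iota_V^{(k)}$ is the embedding of $\mathcal{A}_V^{(k)}$ into $\mathcal{A}_V$. We now show
that, for any $\gamma_V : M \to \mathcal{A}_V$,

$$
h^{\mathrm{CNT}}_\omega(\Theta, \tau_V) = \lim_{k\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, \tau_V^{(k)}) . \tag{8.29}
$$

This leads to the result that, in order to compute the CNT entropy, one must
essentially compute the CNT entropy rates of the finite-dimensional subalgebras
projected out by the spectral projections of local states.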
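To get a feeling for the truncation by spectral projections, one may look at the entropy of the local state restricted to $P_V^{(k)}\mathcal{A}_V P_V^{(k)} \oplus \mathbb{C}\,Q_V^{(k)}$: the first $k$ eigenvalues are kept and the remaining ones are lumped into the single weight $\omega(Q_V^{(k)})$. The following is a minimal sketch under stated assumptions: the geometric `spectrum` is a hypothetical example of a decreasingly ordered local spectrum, and the helper names are ours. It only illustrates that the tail weight vanishes and that the entropy of the restricted state increases to $S(\rho_V)$ as $k$ grows; it does not compute any CNT entropy rate.

```python
import math

def eta(x):
    """The function eta(x) = -x log x, with eta(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0

def truncated_entropy(spectrum, k):
    """Entropy of the state restricted to P A P (+) C Q, P = first k eigenprojections."""
    head = spectrum[:k]
    tail = sum(spectrum[k:])          # weight omega(Q) of the lumped tail
    return sum(eta(r) for r in head) + eta(tail)

# Hypothetical, decreasingly ordered spectrum: normalised geometric weights.
raw = [0.5 ** (i + 1) for i in range(40)]
norm = sum(raw)
spectrum = [r / norm for r in raw]

full = sum(eta(r) for r in spectrum)  # von Neumann entropy of the full spectrum
values = [truncated_entropy(spectrum, k) for k in range(1, len(spectrum) + 1)]
tails = [sum(spectrum[k:]) for k in range(1, len(spectrum) + 1)]

for k in (1, 2, 5, 10, 40):
    print(k, tails[k - 1], values[k - 1], full)
```

The monotone increase reflects the fact that passing from $k$ to $k+1$ refines the last block, which cannot lower the Shannon entropy of the resulting distribution.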
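Theorem 8.1.1 ([275]) Under the assumptions 1-3 on $(\mathcal{A}, \Theta, \omega)$,

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \lim_{V\uparrow\mathbb{R}^3}\, \lim_{k\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{A}_V^{(k)}) .
$$

Proof Writing $h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{A}_V^{(k)}) = h^{\mathrm{CNT}}_\omega(\Theta, \iota_V\circ\iota_V^{(k)})$ and using (8.29) and (8.15)
one gets

$$
\lim_{k\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{A}_V^{(k)}) \le \sup_{\gamma_V : M \to \mathcal{A}_V} h^{\mathrm{CNT}}_\omega(\Theta, \underbrace{\iota_V\circ\gamma_V}_{\tau_V})
= \sup_{\gamma_V : M \to \mathcal{A}_V}\, \lim_{k\to\infty} h^{\mathrm{CNT}}_\omega(\Theta, \underbrace{\iota_V\circ\iota_V^{(k)}\circ\sigma_V^{(k)}\circ\gamma_V}_{\tau_V^{(k)}})
$$
$$
\le \lim_{k\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \iota_V\circ\iota_V^{(k)}) = \lim_{k\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{A}_V^{(k)}) .
$$

Therefore, the result follows from (8.28) and the above estimates which yield

$$
\sup_{\gamma_V : M \to \mathcal{A}_V} h^{\mathrm{CNT}}_\omega(\Theta, \iota_V\circ\gamma_V) = \lim_{k\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{A}_V^{(k)}) .
$$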

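Proof of (8.29) We argue as in the proof of Proposition 8.1.2; we thus set
$\gamma_j := \Theta^j\circ\tau_V$, $\gamma_j^{(k)} := \Theta^j\circ\tau_V^{(k)}$ and estimate the norms

$$
\left\| \frac{\omega(X'_i\, \Theta^j\circ\tau_V[\,\cdot\,])}{\omega(X'_i)} - \frac{\omega(X'_i\, \Theta^j\circ\tau_V^{(k)}[\,\cdot\,])}{\omega(X'_i)} \right\| , \qquad (*)
$$

where a short hand notation for (8.5) has been used and the $X'_i$ are positive elements
of the commutant $\pi_\omega(\mathcal{A})'$ such that $\sum_i X'_i = 1$.

By writing $\mathcal{A}_V \ni X = (P_V^{(k)} + Q_V^{(k)})\, X\, (P_V^{(k)} + Q_V^{(k)})$, and setting
$\nu_i(M) := \omega(X'_i\, \Theta^j\circ\tau_V[M])$ and $\nu_i^{(k)}(M) := \omega(X'_i\, \Theta^j\circ\tau_V^{(k)}[M])$, one finds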

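$$
\nu_i(M) - \nu_i^{(k)}(M) = \underbrace{\omega\big(\Theta^{-j}[X'_i]\, Q_V^{(k)} \gamma_V[M]\, Q_V^{(k)}\big)}_{A}
+ \underbrace{\omega\big(\Theta^{-j}[X'_i]\, P_V^{(k)} \gamma_V[M]\, Q_V^{(k)}\big)}_{B}
+ \underbrace{\omega\big(\Theta^{-j}[X'_i]\, Q_V^{(k)} \gamma_V[M]\, P_V^{(k)}\big)}_{C}
$$
$$
- \underbrace{\omega\big(\Theta^{-j}[X'_i]\, Q_V^{(k)}\big)\, \frac{\omega\big(Q_V^{(k)} \gamma_V[M]\, Q_V^{(k)}\big)}{\omega(Q_V^{(k)})}}_{D} .
$$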


Since $0 \le \Theta^{-j}[X'_i] =: Z \in \pi_\omega(\mathcal{A})'$ commutes with the projections $Y = P_V^{(k)}, Q_V^{(k)}$,
one can write $ZY = \sqrt{Z}\, Y \sqrt{Z} = \sqrt{ZY}\, \sqrt{ZY}$; thus, using the Cauchy-Schwarz
inequality (5.51), one estimates


AR < w(ZOY) ozo Pym aiM.) 
< w(X}) (ZOP ly I M]? 
2 < w(Z PY) w(ZOY yIM IPP ywIiMlOy) 
< W(X) (ZOP yv IMI? 
2 < w(ZPP)o(ZO G WIMPY wM) 
< W(X) o(ZP) yv l? IMI? 
ID? < w(X}) oZy) lv I? M1? 


w& 


a 


where (5.34) and $Y \le 1 \Rightarrow X^\dagger Y X \le X^\dagger X$ have been repeatedly used; further,
in the expression $C$, $Z$ has been transferred to the right side of $\omega(\cdot)$ before applying
(5.51).

Since $\sum_i X'_i = 1$, these estimates yield


2 al - i — VP < 16l J uox p) = 6l X ri 
~ 7 j=k+1 
<e 


for any $\varepsilon > 0$ and $k$ sufficiently large. Notice that the summands are $\omega(X'_i)$ times the
squares of the norms $(*)$; we can now distinguish between the set $I$ of those $i$'s such
that the norms $(*) \le \varepsilon^{1/3}$ and the rest $I^c$. Then,


2 


> eS" wX) 


iele 


(k) 
“i—i 


vE?) 


JeF ogn” 


iel¢ 
implies that the total weight of $I^c$ is smaller than $\varepsilon^{1/3}$. As in the proof of
Proposition 8.1.2, this can be used to show that, for any $\eta > 0$,

$$
\frac{1}{n}\, \Big| H_\omega(\tau_V, \Theta\circ\tau_V, \ldots, \Theta^{n-1}\circ\tau_V) - H_\omega(\tau_V^{(k)}, \Theta\circ\tau_V^{(k)}, \ldots, \Theta^{n-1}\circ\tau_V^{(k)}) \Big| \le \eta
$$

for $k$ sufficiently large.


8.1.3 CNT Entropy: Stationary Couplings 


In this section we reconsider the expressions (8.21) and (8.22) in the following
algebraic setting:

• a von Neumann algebra $\mathcal{A}$ with a normal state $\omega$;

• an Abelian von Neumann algebra $\mathcal{B}$ with a normal state $\omega_\mu$⁵;

• the tensor product von Neumann algebra $\mathcal{A}\otimes\mathcal{B}$ equipped with a normal state $\tilde\omega$
such that its marginal states are $\tilde\omega{\upharpoonright}\mathcal{A} = \omega$ and $\tilde\omega{\upharpoonright}\mathcal{B} = \omega_\mu$.

Let $A \subseteq \mathcal{A}$ and $B \subseteq \mathcal{B}$ be finite-dimensional C* subalgebras; as CPU maps $\gamma_1$,
respectively $\gamma_2$ in (8.21) we shall take the embeddings $\gamma_1 = \iota_B$, respectively $\gamma_2 = \iota_A$
of $B$, respectively $A$ into $\mathcal{A}\otimes\mathcal{B}$. We shall focus upon the quantity $H_{\tilde\omega}(B \,|\, A)$: one
has the following result [252].


Proposition 8.1.8 Let $B$ be the random variable corresponding to the subalgebra $B$
and $H_\mu(B)$ denote the Shannon entropy corresponding to the von Neumann entropy
of the state $\omega_\mu$ restricted to $B$. Then,

$$
H_{\tilde\omega}(B \,|\, A) = H_\mu(B) - S\big(\omega\otimes\omega_\mu{\upharpoonright}A\otimes B\,,\ \tilde\omega{\upharpoonright}A\otimes B\big) . \tag{8.30}
$$


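Proof Given the minimal projections $\{b_j\}_{j\in I_B}$, $I_B = \{1, 2, \ldots, d\}$, of the finite
dimensional Abelian algebra $B \subseteq \mathcal{B}$ and a convex decomposition $\tilde\omega = \sum_{i\in I} \lambda_i\, \tilde\omega_i$,
one can construct a finer decomposition of the form $\tilde\omega = \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij}\, \tilde\omega_{ij}$, by further
decomposing $\tilde\omega_i = \sum_{j\in I_B} \mu_{ij}\, \tilde\omega_{ij}$, where the states $\tilde\omega_{ij}$ on $A\otimes B$ and their weights
$\mu_{ij}$ are defined by

$$
\tilde\omega_{ij}(a\otimes b) := \frac{\tilde\omega_i(a\otimes b_j b)}{\tilde\omega_i(b_j)} , \qquad \mu_{ij} := \tilde\omega_i(b_j) , \qquad (*)
$$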


⁵ According to Sect. 5.3.2, the state $\omega_\mu$ corresponds to integration with respect to a suitable measure
$\mu$ and measure space $\mathcal{X}$.
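for all $a \in A$ and $b \in B$. Since for all $i \in I$

$$
\tilde\omega_{ij}(b_k) = \frac{\tilde\omega_i(b_j b_k)}{\tilde\omega_i(b_j)} = \delta_{jk} ,
$$

the restrictions $\tilde\omega_{ij}{\upharpoonright}B$ are probability distributions $\Lambda_{ij} = \{\delta_{jk}\}_{k\in I_B}$ with zero
Shannon entropy. Then, using (8.22) and (6.35), one computes

$$
H_{\tilde\omega}^{B,A}\big(\{\lambda_i\mu_{ij}, \tilde\omega_{ij}\}_{i\in I,\, j\in I_B}\big) = H_\mu(B) - S(\omega{\upharpoonright}A)
+ \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij}\, S(\tilde\omega_{ij}{\upharpoonright}A) , \qquad (**)
$$
$$
H_{\tilde\omega}^{B,A}\big(\{\lambda_i, \tilde\omega_i\}_{i\in I}\big) = H_\mu(B) - S(\omega{\upharpoonright}A)
+ \sum_{i\in I} \lambda_i \Big( S(\tilde\omega_i{\upharpoonright}A) - S(\tilde\omega_i{\upharpoonright}B) \Big) .
$$

It thus turns out that

$$
H_{\tilde\omega}^{B,A}\big(\{\lambda_i\mu_{ij}, \tilde\omega_{ij}\}_{i\in I,\, j\in I_B}\big) - H_{\tilde\omega}^{B,A}\big(\{\lambda_i, \tilde\omega_i\}_{i\in I}\big)
= - \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij} \log\mu_{ij}
$$
$$
- \sum_{i\in I} \lambda_i\, S(\tilde\omega_i{\upharpoonright}A) + \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij}\, S(\tilde\omega_{ij}{\upharpoonright}A) \ge 0 .
$$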


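Indeed, since $\mu_{ij}\, \tilde\omega_{ij} \le \tilde\omega_i$, the monotonicity of $f(x) = \log x$ as an operator function
(see Example 5.2.3.9) gives

$$
\sum_{j\in I_B} \mu_{ij}\, \tilde\omega_{ij}{\upharpoonright}A\, \log \tilde\omega_{ij}{\upharpoonright}A
= \sum_{j\in I_B} \mu_{ij}\, \tilde\omega_{ij}{\upharpoonright}A\, \big( \log(\mu_{ij}\, \tilde\omega_{ij}{\upharpoonright}A) - \log\mu_{ij} \big)
$$
$$
\le \sum_{j\in I_B} \mu_{ij}\, \tilde\omega_{ij}{\upharpoonright}A\, \big( \log \tilde\omega_i{\upharpoonright}A - \log\mu_{ij} \big)
= \tilde\omega_i{\upharpoonright}A\, \log \tilde\omega_i{\upharpoonright}A - \sum_{j\in I_B} \mu_{ij}\, \tilde\omega_{ij}{\upharpoonright}A\, \log\mu_{ij} .
$$

Therefore, after multiplying by the weights $\lambda_i$ and summing over $i \in I$, by taking
the trace and considering that the marginal state $\tilde\omega_{ij}{\upharpoonright}A$ has trace 1, one finally gets

$$
- \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij}\, S(\tilde\omega_{ij}{\upharpoonright}A) \le - \sum_{i\in I} \lambda_i\, S(\tilde\omega_i{\upharpoonright}A) - \sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij} \log\mu_{ij} .
$$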


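One thus concludes that in order to compute $H_{\tilde\omega}(B \,|\, A)$ one can start with
decompositions of $\tilde\omega$ in terms of states of the form $(*)$. Then, consider the decomposition
$\tilde\omega = \sum_{j\in I_B} \nu_j\, \tilde\omega_j$, where

$$
\tilde\omega_j(a\otimes b) := \frac{\tilde\omega(a\otimes b_j b)}{\tilde\omega(b_j)} , \qquad \nu_j := \tilde\omega(b_j) , \qquad a \in A ,\ b \in B .
$$

Since $S(\tilde\omega_j{\upharpoonright}B) = 0$, this decomposition contributes with

$$
H_{\tilde\omega}^{B,A}\big(\{\nu_j, \tilde\omega_j\}_{j\in I_B}\big) = H_\mu(B) - S(\omega{\upharpoonright}A) + \sum_{j\in I_B} \nu_j\, S(\tilde\omega_j{\upharpoonright}A) . \qquad (***)
$$

Further, notice that the decomposition appearing in equation $(**)$ is such that

$$
\sum_{i\in I} \lambda_i\, \mu_{ij} = \nu_j = \omega_\mu(b_j) , \qquad \sum_{i\in I} \frac{\lambda_i\, \mu_{ij}}{\nu_j}\, \tilde\omega_{ij} = \tilde\omega_j .
$$

Therefore, by the concavity of the von Neumann entropy (see (5.166))

$$
\sum_{i\in I;\, j\in I_B} \lambda_i\, \mu_{ij}\, S(\tilde\omega_{ij}{\upharpoonright}A) \le \sum_{j\in I_B} \nu_j\, S(\tilde\omega_j{\upharpoonright}A) ,
$$

whence decompositions of the form $(***)$ are optimal. The proof is finally
completed by calculating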

S (v8 wlA @ B, DJA 8 B) = 
Tr(wia 9 wu ]B(logw]A @ w,1B — log ©]A @ B)) = 


= —H,(B) — S (WIA) + È vj Tr(3;1A log (vj 1A) = 


jelg 


= $ vj S (A) — S@IA). 


jEeIB 


The previous considerations are useful in a different approach to the CNT entropy
which was developed in [311] (see also [264]). We shall refer to the formulation used
in [14].


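Definition 8.1.3 Let $(\mathcal{A}, \Theta, \omega)$ be a dynamical triple with $\mathcal{A}$ a hyperfinite von
Neumann algebra and $\omega$ a normal $\Theta$-invariant state; a stationary coupling to a
commutative dynamical triple $(\mathcal{B}, \theta, \omega_\mu)$, where $\mathcal{B}$ is an Abelian von Neumann algebra, is
any triplet of the form $(\mathcal{A}\otimes\mathcal{B}, \Theta\otimes\theta, \tilde\omega)$ where $\tilde\omega$ is a $\Theta\otimes\theta$-invariant state such
that $\tilde\omega{\upharpoonright}\mathcal{A} = \omega$ and $\tilde\omega{\upharpoonright}\mathcal{B} = \omega_\mu$.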


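The quantity $H_{\tilde\omega}(B \,|\, A)$ and its expression (8.30) can be generalized as
follows [252]. For any finite dimensional subalgebra $B \subseteq \mathcal{B}$ let

$$
H_{\tilde\omega}(B \,|\, \mathcal{A}) := \sup_{\tilde\omega = \sum_j \mu_j \tilde\omega_j}\ \sum_j \mu_j \Big( S(\tilde\omega_j{\upharpoonright}B, \tilde\omega{\upharpoonright}B) - S(\tilde\omega_j{\upharpoonright}\mathcal{A}, \tilde\omega{\upharpoonright}\mathcal{A}) \Big) \tag{8.31}
$$
$$
= S(\omega_\mu{\upharpoonright}B) - S\big(\tilde\omega{\upharpoonright}\mathcal{A}\otimes B\,,\ \omega\otimes\omega_\mu{\upharpoonright}\mathcal{A}\otimes B\big) . \tag{8.32}
$$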


It then turns out [252, 264, 311] that

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \sup_{\tilde\omega,\ B\subseteq\mathcal{B}} \Big\{ h^{\mathrm{KS}}_\mu(\theta, B) - H_{\tilde\omega}(B \,|\, \mathcal{A}) \Big\} , \tag{8.33}
$$

where the supremum is computed over all possible stationary couplings and all
finite-dimensional subalgebras $B \subseteq \mathcal{B}$. Notice also that in the expression of the KS entropy,
in accordance with Sect. 5.3.2, we have kept the algebraic notation whereby $B$ stands
for a partition of a phase-space $\mathcal{X}$, the automorphism $\theta$ for a measurable, invertible
dynamical map $T : \mathcal{X} \to \mathcal{X}$ and the state $\omega_\mu$ for a $T$-invariant measure $\mu$.


Remark 8.1.3 The relative entropy as it has been used so far has always involved
density matrices or restrictions of states to finite dimensional subalgebras, whereas
in (8.31), because of the presence of the generic von Neumann algebra $\mathcal{A}$, it
apparently works in a more general context. It turns out that the expression (6.35) for
the relative entropy has a generalization to any unital C* algebra $\mathcal{A}$ and to generic
positive, linear functionals (not even normalized) $\omega_{1,2}$ on it [108, 264, 353]:


ape 


- wi(y()" yO) — Sere x] 
r Eiz 19 y 7 v2 , 


S (w1, w2) = sup f 
0 


where $y(t) = 1 - x(t)$ and $t \mapsto x(t) \in \mathcal{A}$ is any step function with values in $\mathcal{A}$
vanishing in a neighborhood of $t = 0$.


8.1.4 CNT Entropy: Applications 


We start the presentation of various concrete applications of the CNT entropy by 
showing that in a commutative context it reduces to the KS entropy. 

Consider a classical dynamical system $(\mathcal{X}, T, \mu)$ that possesses a generating
partition $P = \{P_i\}_{i=1}^p$ (see Definition 2.3.5) and its corresponding von Neumann
triplet $(\mathcal{M} := L^\infty_\mu(\mathcal{X}), \Theta_T, \omega_\mu)$ (compare Definition 2.2.4). In this framework, the
partition $P$ is identified with the finite-dimensional subalgebra $M_P$ generated by
the characteristic functions $\chi_{P_i}$ of the atoms $P_i$ of $P$. Furthermore, the partitions
$P_k := \bigvee_{j=-k}^{k} T^{-j}(P)$ (which generate the $\Sigma$-algebra of $\mathcal{X}$ when $k \to +\infty$)
correspond to the Abelian finite-dimensional subalgebras $M_k := \bigvee_{j=-k}^{k} \Theta_T^{j}(M_P)$
generated by the characteristic functions of the atoms of $P_k$ (these subalgebras generate
$\mathcal{M}$). We can thus apply the argument of Example 8.1.1.2 to deduce that

$$
H_{\omega_\mu}\big(M_k, \Theta_T(M_k), \ldots, \Theta_T^{n-1}(M_k)\big) = S\big(\omega_\mu{\upharpoonright}M_k^{(n)}\big) = S\big(\omega_\mu{\upharpoonright}M_P^{(n+2k)}\big)
= H_\mu\big(P^{(n+2k)}\big) ,
$$

where we used that $\omega_\mu$ is $\Theta_T$-invariant and that

$$
M_k^{(n)} := \bigvee_{\ell=0}^{n-1} \Theta_T^{\ell}(M_k) = \bigvee_{j=-k}^{n+k-1} \Theta_T^{j}(M_P) = \Theta_T^{-k}\Big( \bigvee_{j=0}^{n+2k-1} \Theta_T^{j}(M_P) \Big) ,
$$

together with (3.1). Thus, from Theorem 3.1.1,

$$
h^{\mathrm{CNT}}_{\omega_\mu}(\Theta_T, M_k) = h^{\mathrm{KS}}_\mu(T, P) = h^{\mathrm{KS}}_\mu(T) .
$$

Then, $h^{\mathrm{CNT}}_{\omega_\mu}(\Theta_T) = h^{\mathrm{KS}}_\mu(T)$ follows from Proposition 8.1.7.
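In the commutative case the whole computation reduces to Shannon entropies of block distributions, and the shift from $n$ to $n + 2k$ in $H_\mu(P^{(n+2k)})$ does not affect the rate. As a minimal sketch, assuming a Bernoulli (i.i.d.) source with a hypothetical one-symbol law `p` (neither the source nor the helper names come from the text), one can check that the $n$-block entropy equals $n$ times the one-symbol entropy and look at the ratios $H(P^{(n+2k)})/n$:

```python
import math
from itertools import product

def block_entropy(p, n):
    """Shannon entropy of the n-block distribution of an i.i.d. source with law p."""
    total = 0.0
    for word in product(range(len(p)), repeat=n):
        prob = math.prod(p[s] for s in word)   # product measure of the cylinder
        if prob > 0:
            total -= prob * math.log(prob)
    return total

p = [0.5, 0.3, 0.2]          # hypothetical one-symbol distribution
H1 = block_entropy(p, 1)     # entropy rate of the Bernoulli source
k = 2                        # refinement parameter, as in the partitions P_k

# Ratios H(P^(n+2k)) / n for a few n; they equal (n + 2k) H1 / n.
rates = {n: block_entropy(p, n + 2 * k) / n for n in (1, 2, 4, 6)}
for n, r in rates.items():
    print(n, r, H1)
```

Since the block entropy is exactly $(n+2k)\,H_1$ here, the ratios decrease towards $H_1$ as $n$ grows for every fixed $k$, which is the classical content of the equality between the CNT entropy rate of $M_k$ and the KS entropy in this example.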


8.1.4.1 CNT Entropy: Finite Quantum Systems 

For finite-dimensional quantum dynamical systems, the C* algebra $\mathcal{A}$ is a matrix
algebra $M_d(\mathbb{C})$ and the states are density matrices $\rho \in M_d(\mathbb{C})$ with von Neumann
entropy always bounded from above by $\log d$. None of these systems can support
a non-zero CNT entropy rate, in agreement with the fact that their dynamics, given
by a unitary $U \in M_d(\mathbb{C})$, is quasi-periodic and shows the behavior discussed in
Remark 7.4.7, at the most. We shall prove this by considering the slightly more
general scenario studied in [54].


Proposition 8.1.9 Let $(\mathcal{A}, \Theta, \omega)$ be a quantum dynamical system with $\mathcal{A} = B(\mathbb{H})$,
$\omega$ corresponding to a density matrix $\rho \in B_1^+(\mathbb{H})$ with finite von Neumann entropy
$S(\rho)$ and invariant under the automorphism $\Theta$ such that

$$
B(\mathbb{H}) \ni X \mapsto \Theta[X] = e^{iH}\, X\, e^{-iH} ,
$$

where the Hamiltonian $H$ has a discrete spectrum. Then, $h^{\mathrm{CNT}}_\omega(\Theta) = 0$.


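Proof Let $P^{(n)}$ be the projector onto the subspace of $\mathbb{H}$ spanned by the eigenvectors
relative to the first $n$ decreasingly ordered eigenvalues of $H$ and $Q^{(n)} := 1 - P^{(n)}$.
Then $\mathcal{A}$ is generated as a von Neumann algebra by the increasing sequence of
subalgebras $A^{(n)} := P^{(n)} \mathcal{A}\, P^{(n)} \oplus \mathbb{C}\, Q^{(n)}$. These subalgebras are $\Theta$-invariant; also,
they diagonalize $\rho$ for it commutes with $H$ since $\omega\circ\Theta = \omega$. Then, from
Proposition 8.1.7, (8.12) and (8.10) it follows that

$$
h^{\mathrm{CNT}}_\omega(\Theta) = \lim_{n\to+\infty} h^{\mathrm{CNT}}_\omega(\Theta, A^{(n)})
= \lim_{n\to+\infty}\, \lim_{k\to+\infty} \frac{1}{k}\, H_\omega\big(A^{(n)}, A^{(n)}, \ldots, A^{(n)}\big)
$$
$$
= \lim_{n\to+\infty}\, \lim_{k\to+\infty} \frac{1}{k}\, H_\omega\big(A^{(n)}\big)
\le \lim_{n\to+\infty}\, \lim_{k\to+\infty} \frac{1}{k}\, S\big(\omega{\upharpoonright}A^{(n)}\big) = 0 .
$$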
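8.1.4.2 CNT Entropy: Quantum Spin Chains

The algebraic structure of a quantum spin chain $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ is such that we can
apply Proposition 8.1.6. Let then $\{A_{[-\ell,\ell]}\}_{\ell\in\mathbb{N}}$ be an increasing sequence of
finite-dimensional local subalgebras that generate $\mathcal{A}_{\mathbb{Z}}$, then

$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma) = \lim_{\ell\to\infty} h^{\mathrm{CNT}}_\omega(\Theta_\sigma, A_{[-\ell,\ell]}) = \lim_{\ell\to\infty} h^{\mathrm{CNT}}_\omega(\Theta_\sigma, A_{[1,\ell]}) ,
$$

where the second equality follows from the translation invariance of $\omega$ and the
property (8.13). Further, since $\Theta_\sigma^j[A_{[1,\ell]}] = A_{[1+j,\ell+j]} \subseteq A_{[1,\ell+n-1]}$, for $0 \le j \le n-1$,
using (8.15), (8.12) and (8.10) one derives

$$
H_\omega\big(A_{[1,\ell]}, \Theta_\sigma(A_{[1,\ell]}), \ldots, \Theta_\sigma^{n-1}(A_{[1,\ell]})\big) \le H_\omega\big(A_{[1,\ell+n-1]}, \ldots, A_{[1,\ell+n-1]}\big)
= H_\omega\big(A_{[1,\ell+n-1]}\big) \le S\big(\rho_{[1,\ell+n-1]}\big) ,
$$

where $\rho_{[1,\ell+n-1]}$ is the local density matrix corresponding to the state $\omega$ restricted
to the local subalgebra $A_{[1,\ell+n-1]}$. By using (8.24) and Example 7.5.1.1 one thus
concludes with


Proposition 8.1.10 $h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \le s(\omega)$ for any $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$.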


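In order to show whether and when $h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \ge s(\omega)$, we use the following
strategy. Consider a local subalgebra $A_{[1,\ell]}$ with fixed $\ell \ge 1$ and set $n = k\ell + p$,
$0 \le p < \ell$ in (8.24); using Definition 8.1.2 and property (8.16) we get the following
lower bound:

$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \ge h^{\mathrm{CNT}}_\omega(\Theta_\sigma, A_{[1,\ell]})
= \lim_{n\to\infty} \frac{H_\omega\big(A_{[1,\ell]}, \Theta_\sigma(A_{[1,\ell]}), \ldots, \Theta_\sigma^{k\ell+p-1}(A_{[1,\ell]})\big)}{k\ell + p}
$$
$$
\ge \lim_{k\to\infty} \frac{1}{k\ell}\, H_\omega\big(A_{[1,\ell]}, \Theta_\sigma^{\ell}(A_{[1,\ell]}), \ldots, \Theta_\sigma^{(k-1)\ell}(A_{[1,\ell]})\big)
= \lim_{k\to\infty} \frac{1}{k\ell}\, H_\omega\big(A_{[1,\ell]}, A_{[\ell+1,2\ell]}, \ldots, A_{[(k-1)\ell+1,k\ell]}\big)
$$
$$
\ge \lim_{k\to\infty} \frac{1}{k\ell}\, H_\omega^{\{A_{[j\ell+1,(j+1)\ell]}\}_{j=0}^{k-1}}\big(\{\lambda_{i^{(k)}}, \omega_{i^{(k)}}\}\big) , \tag{8.34}
$$

where we used (8.7) with $\omega = \sum_{i^{(k)}} \lambda_{i^{(k)}}\, \omega_{i^{(k)}}$ any chosen decomposition adapted to
the $k$ commuting local subalgebras $A_{[j\ell+1,(j+1)\ell]}$.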

8.1.4.3 CNT Entropy: FCS States 

If a quantum spin chain is endowed with a finitely correlated state $\omega$ as defined
in Sect. 7.4.5, then its CNT entropy coincides with the entropy density $s(\omega)$ (see
Sect. 7.5) [161].

Proposition 8.1.11 Let $(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$ be a quantum spin chain with a FCS $\omega$, then
$h^{\mathrm{CNT}}_\omega(\Theta_\sigma) = s(\omega)$.


Proof Because of Proposition 8.1.10, the result follows if we show that
$h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \ge s(\omega)$; for this we use the lower bound (8.34) and Remark 7.4.15. Therefore, we fix a
local subalgebra $A_{[1,\ell]}$; since altogether the arguments of the $k$-subalgebra entropy
in (8.34) generate the local subalgebra $A_{[1,k\ell]}$ we need consider the local state $\rho_{[1,k\ell]}$.
We start from a decomposition as in (7.101) with $n = k\ell$ and regroup the indices
$j^{(k\ell)}$ as follows
$$
j^{(k\ell)} = \underbrace{j_1 j_2 \cdots j_\ell}_{i_1}\ \underbrace{j_{\ell+1} j_{\ell+2} \cdots j_{2\ell}}_{i_2}\ \cdots\ \underbrace{j_{(k-1)\ell+1} j_{(k-1)\ell+2} \cdots j_{k\ell}}_{i_k} .
$$

Notice that the index j of the Kraus operators in the CPU map E defining the 
FCS w runs over a finite set J, whence j «O eI is while each i, in the regrouped 


index i® = iji2---ix belongs to the index set jhe thus i € I. We have seen in 


J 
Example 7.4.15 that the weights p( j5 assigned to the indices j (ko) give rise to 


a compatible family of local probability distributions 7“ and that these define a 


global shift-invariant state w, over the classical spin chain a We then construct 
y : i(k) 
the decomposition w = } jw Awww, Where A;w := pli ®©) and wiw = Pr ke} 


and use it to compute 


Arta kol k . 
ut alia (DCT }) = > n(pi™) = >» > n(p! (ij) ae 


Me, j=l ijel 
J 
k . 
+ 2S (wlArg—veriia) — > YS PGS (oi Mig-derija) ; 
j=l j=lijer! 


where, with the notation of Example 7.4.15, 


GA -(k) i. pu iM _ Üj 
p’ (ij) = x PUN), w; = > mi) Pike = Pia: 
iM ek, ierk J 
rf I£ 
J J 
ij fixed ij fixed 


From translation invariance of FCS it follows that w}Ai(j—1)e+1, je] = Pu, wit 


+(€) f 
Al(j—e+1, je] = Phe and p’ (ij) = pi) for some je E€ I£. Therefore, (8.34) 
reads 
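$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \ge \frac{1}{\ell} \lim_{k\to\infty} \Big[ \frac{1}{k} \sum_{i^{(k)}} \eta\big(p(i^{(k)})\big) - \sum_{j^{(\ell)}} \eta\big(p(j^{(\ell)})\big) \Big]
$$
$$
+ \frac{1}{\ell}\, S\big(\rho_{[1,\ell]}\big) - \frac{1}{\ell} \sum_{j^{(\ell)}} p(j^{(\ell)})\, S\big(\rho^{j^{(\ell)}}_{[1,\ell]}\big) .
$$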
In the limit £ — ox, the second term in the first line gives the Shannon entropy rate 
of the classical spin chain oO, Os, wr) as well as the limit k —> oo in the first 


term, while the first contribution in the second line gives the von Neumann entropy 
density of (Az, O,, w). Thus, 


1 7 (£) 

CNT i > ia i 

W to 2 sw) = fim y 2 Pit DS (oa) - 
jOal; 


The proof is then completed by using that, as discussed in Example (7.4.15), 
S (o) < 2log£. 


8.1.4.4 CNT Entropy: Price-Powers Shifts 
The non-commutative shifts discussed in Sect. 7.4.5 offer an interesting variety of
behaviors of the CNT entropy [14].

By construction the quasi-local algebra $\mathcal{A}_g$ is generated by the Abelian algebra $A_1$
consisting of the orthogonal projections $(1 \pm e_1)/2$ and by its images $A_\ell := \Theta_\sigma^{\ell-1}(A_1)$.
These are also Abelian subalgebras, but in general they do not commute with each
other; moreover, denoting by $M_k$ the subalgebra generated by $A_\ell$, $\ell = 1, 2, \ldots, k$,
these generate the von Neumann algebra $\mathcal{M}_g$ in Example 7.4.17.1. Therefore, one
can compute $h^{\mathrm{CNT}}_\omega(\Theta_\sigma)$ by means of Proposition 8.1.7. Notice that, because of
(7.122), the $M_k$ can be represented as subalgebras of the spin algebras $M_2(\mathbb{C})^{\otimes k}$;
this fact allows one to derive a bitstream-independent upper bound to the CNT entropy.
Indeed, by using the properties in Proposition 8.1.3, one estimates

$$
H_\omega\big(M_k, \Theta_\sigma(M_k), \ldots, \Theta_\sigma^{n-1}(M_k)\big) \le H_\omega(M_{n+k-1})
\le H_\omega\big(M_2(\mathbb{C})^{\otimes(n+k-1)}\big) = (n+k-1) \log 2 ,
$$

whence

$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma, M_k) \le \log 2 \ \Longrightarrow\ h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \le \log 2 .
$$

We discuss a few particular cases, a thorough analysis of the dependence of the CNT
entropy on the bitstream being provided by [264].


1. For a bitstream $g \equiv 0$, $(\mathcal{M}_g, \Theta_\sigma, \omega)$ amounts to a classical Bernoulli shift and

$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma) = \log 2 . \tag{8.35}
$$

2. If $g(n) = 1$ for all $n \ge 1$, then $\mathcal{M}_g$ describes a Fermi system on a one-dimensional
lattice at infinite temperature (see Example 7.4.17.3), where pairs $e_{2i}$, $e_{2i+1}$ give
rise to annihilation and creation operators $a_i$, $a_i^\dagger$ fulfilling the CAR. Since $n$ such
operators generate an algebra isomorphic to $M_{2^n}(\mathbb{C})$, an argument similar to the
one that gave the universal upper bound $\log 2$ yields

$$
H_\omega\big(M_k, \Theta_\sigma(M_k), \ldots, \Theta_\sigma^{n-1}(M_k)\big) \le H_\omega(M_{n+k-1})
\le H_\omega\big(M_{2^{[(n+k-1)/2]}}(\mathbb{C})\big) = \Big[\frac{n+k-1}{2}\Big] \log 2 ,
$$

whence

$$
h^{\mathrm{CNT}}_\omega(\Theta_\sigma, M_k) \le \frac{\log 2}{2} \ \Longrightarrow\ h^{\mathrm{CNT}}_\omega(\Theta_\sigma) \le \frac{\log 2}{2} ,
$$

where we have set $[j/2] = j/2$ for $j$ even and $[j/2] = (j+1)/2$ for $j$ odd. On
the other hand, (7.121) implies

[e2j—1€2j , e2x-1€2k] = (e2j-1€2j) (e2k-1€2x) X 
X (1 _ eee Gen) aiff): 


for all j, k > 1; therefore, operators of the form e2j—1e2; commute. Let A denote 


the Abelian algebra generated by e; e2 (which is isomorphic to the diagonal matrix 
algebra D2(C)); then, the Abelian algebras (ož (4) commute and generate 
an Abelian algebra isomorphic to the diagonal matrix algebra D2» (C). Then, 
using Example 8.1.1.1, one gets 


H, (A, @2(A),..., o2"-(A)) = n log2. 


Finally, (8.27) yields 


1 log 2 
BENT (05) = KET (0, A) = zho (02, A) = = , 
whence 
CNT log2 
hy (0o) = ~ (8.36) 


3. In the case of an asymptotically highly anti-commutative Price-Powers shift [262], for any $W_i \in \mathcal{A}_g$ there exists an infinite set $I(i)$ of integers such that

$$\big\{\Theta_\sigma^n[W_i]\,,\, \Theta_\sigma^m[W_i]\big\} = W_{i+n}W_{i+m} + W_{i+m}W_{i+n} = 0 ,$$

for all $n, m \in I(i)$. We shall show that

$$h^{CNT}_\omega(\Theta_\sigma) = 0 . \qquad (8.37)$$




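In order to do that, we shall consider a stationary coupling of $(\mathcal{M}_g, \Theta_\sigma, \omega)$ to a commutative dynamical triple $(\mathcal{B}, \theta, \omega_\mu)$ (see Definition 8.1.3 and the preceding discussion), namely a triplet of the form $(\mathcal{M}_g \otimes \mathcal{B}, \Theta_\sigma \otimes \theta, \tilde\omega)$ where $\tilde\omega$ is a $\Theta_\sigma \otimes \theta$-invariant state such that $\tilde\omega\restriction\mathcal{M}_g = \omega$ and $\tilde\omega\restriction\mathcal{B} = \omega_\mu$. Let $p \in \mathcal{B}$ be any projection; then,

$$\big\{W_{i+n} \otimes \theta^n[p]\,,\, W_{i+m} \otimes \theta^m[p]\big\} = \big\{W_{i+n}\,,\, W_{i+m}\big\} \otimes \theta^n[p]\,\theta^m[p] = 0$$

for all $n, m \in I(i)$. As done in Example 7.4.17.5, by setting

$$X = \frac{1}{N} \sum_{j=1;\; n_j \in I(i)}^{N} W_{i+n_j} \otimes \theta^{n_j}[p]$$

for an arbitrary $N \in \mathbb{N}$, one estimates

$$\big|\tilde\omega(W_i \otimes p)\big| = \big|\tilde\omega(X)\big| \leq \frac{1}{\sqrt{N}} .$$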


Since $N$ is arbitrary, we deduce that $\tilde\omega(W_i \otimes p) = 0$ for all $W_i \in \mathcal{M}_g$ and all $p \in \mathcal{B}$, which means that the global state factorizes: $\tilde\omega = \omega \otimes \omega_\mu$. This fact
in turn implies that the relative entropy contributions in (8.32) vanish so that 
Hz (B | A) = S (w, ÌB) for all finite-dimensional subalgebras B C B. Since 
hSS (0, wu) < S (wy 1B), the result follows from (8.33). 
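
The commutation argument used in case 2 of the list above can be checked mechanically. The following is a minimal sketch, assuming that relation (7.121), which is not reproduced in this section, has the form $e_i e_j = (-1)^{g(|i-j|)} e_j e_i$; the helper names `swap_sign` and `fermi_bitstream` are hypothetical and introduced only for illustration.

```python
def swap_sign(word_a, word_b, g):
    """Sign s with A B = s B A, where A and B are products of generators e_i.

    Assumes e_i e_j = (-1)**g(|i - j|) e_j e_i: moving B past A picks up
    one factor per pair of generators.
    """
    exponent = sum(g(abs(a - b)) for a in word_a for b in word_b)
    return -1 if exponent % 2 else 1


def fermi_bitstream(n):
    # g(n) = 1 for all n >= 1; g(0) = 0 because each e_i commutes with itself.
    return 1 if n >= 1 else 0


def zero_bitstream(n):
    # g = 0: all generators commute (classical Bernoulli case).
    return 0


# Products e_{2j-1} e_{2j}, j = 1..5, encoded by their index pairs.
pairs = [(2 * j - 1, 2 * j) for j in range(1, 6)]

all_pairs_commute = all(
    swap_sign(a, b, fermi_bitstream) == 1 for a in pairs for b in pairs
)
single_generators_sign = swap_sign((1,), (2,), fermi_bitstream)
classical_sign = swap_sign((1,), (2,), zero_bitstream)

print(all_pairs_commute, single_generators_sign, classical_sign)
```

For the Fermi bitstream each pair product meets another pair product through four generator exchanges, so the total sign is always $+1$, while two single generators anticommute.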


8.1.4.5 CNT Entropy: Quasi-free Bosons and Fermions 
For quasi-local algebras of Bosons and Fermions in translation-invariant quasi-free states as those considered in Example 7.5.1.3, one can consider the discrete space-translation group $\Theta := \{\Theta_n\}_{n \in \mathbb{Z}^3}$ (see Example 7.4.2.1) and enlarge the scope of Definition 8.1.2 to cover the fact that there are now three directions along which the n-CPU entropy (8.3) can increase.

Given a CPU map $\gamma : M \mapsto \mathcal{A}^{F,B}$ from a finite-dimensional unital C* algebra into the Fermi, respectively Bose quasi-local algebra, a natural way to proceed [276] is to consider, for each $k = (k_1, k_2, k_3) \in \mathbb{N}^3$, the parallelepipeds

$$B(k) := \{ n \in \mathbb{N}^3 \,:\, 0 \leq n_i < k_i ,\; i = 1, 2, 3 \} ,$$

CPU maps of the form $\Theta_n \circ \gamma$ and then to replace (8.24) by

$$h^{CNT}_{\omega_A}(\Theta, \gamma) := \lim_{k_1, k_2, k_3 \to +\infty} \frac{1}{k_1 k_2 k_3}\, H_{\omega_A}\big(\{\Theta_n \circ \gamma\}_{n \in B(k)}\big) , \qquad (8.38)$$

while keeping the definition of (8.25) for the CNT entropy of $\Theta$. Notice that the limit on the right hand side of (8.38) exists because of the subadditivity property (8.14) and the assumed translation-invariance of the quasi-free state $\omega_A$ together with property (8.13).

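With the same technical assumptions ensuring the result in Example 7.5.1.3, by means of (8.1.1) it can be shown that the CNT entropy of the space-translations coincides with the mean entropy [276]:

$$h^{CNT}_{\omega_A}(\Theta) = \frac{1}{(2\pi)^3} \int dk\, \Big(\eta(R_A(k)) + \eta(1 - R_A(k))\Big) \quad \text{(Fermions)}$$
$$h^{CNT}_{\omega_A}(\Theta) = \frac{1}{(2\pi)^3} \int dk\, \Big(\eta(R_A(k)) - \eta(1 + R_A(k))\Big) \quad \text{(Bosons)} .$$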


Remarks 8.1.4 


1. Quasi-free automorphisms in the Fermionic case have been considered in [256, 341]; the following result holds [265]: let $f, g \in L^2_{[0,2\pi]}(dp)$ be single particle wave-functions for a Fermi system on a lattice ($[0, 2\pi]$ being the momentum space). Consider

$$\Theta_U(a^{\#}(f)) = a^{\#}(Uf) , \qquad (Uf)(p) = e^{i\omega(p)} f(p) ;$$

it defines a quasi-free automorphism over the CAR algebra with single particle energy $\omega(p)$ assumed to be a real absolutely continuous function of the momentum variable $p$. Further, let

$$\omega\big(a(f)\, a^\dagger(g)\big) = \int_0^{2\pi} dp\; \rho(p)\, f^*(p)\, g(p)$$

define a quasi-free $\Theta_U$-invariant state over the system, with $0 \leq \rho(p) \leq 1$ a measurable one-particle distribution over $[0, 2\pi]$. Then,

$$h^{CNT}_\omega(\Theta_U) = \int_0^{2\pi} dp\; |\omega'(p)|\, \Big(\eta(\rho(p)) + \eta(1 - \rho(p))\Big) ,$$

where $\omega'(p) := d\omega(p)/dp$ is the group velocity. The physical interpretation is suggestive [256]: for a quasi-free automorphism the dynamical entropy production as described by the CNT entropy amounts to a flux of single particle Fermionic entropy governed by the group velocity.

2. While in one-dimensional quantum dynamical systems the $1/n$ factor controls the asymptotic increase of the n-CPU map entropies, this is no longer true in higher dimension. An instance of this fact is the previous result where one divides by volumes in order to avoid divergences due to the freedom to move in more than one direction. In general, that is in the case of the time-evolution in dimension $\geq 2$, it turns out that $h^{CNT}_\omega(\Theta) = +\infty$; this problem arises already on the classical level and a possible way out is to consider space and time translations together [54, 184].

8.1.5 Entropic Quantum K-Systems


In Sect. 3.1.2 it was proved that the algebraic structure of classical Kolmogorov systems introduced in Sect. 2.3.1 can be characterized by means of the dynamical entropy rate. In particular, from the proof of Theorem 3.1.3 it emerges that the equivalence between the existence of a K-partition, namely property (1) in the theorem, and the entropic properties (3)-(6) hinges upon property (2), that is the triviality of the tail of all finite-dimensional partitions.

Algebraic quantum K-systems have been introduced in Sect. 7.4.4 as generalizations of classical K-systems; in this section, we present an entropic characterization of non-commutative K-systems that partially mimics that given in Theorem 3.1.3. This gives rise to a class of quantum dynamical systems with particular clustering properties, but in general not K-systems from the algebraic point of view.

We start by considering the relations (3)-(6) in the above mentioned theorem and study how they are affected if one substitutes the n-subalgebra entropies for the Shannon entropies. For the sake of simplicity, we shall restrict to the case of AF algebras $\mathcal{A}$ (see Remark 7.4.1) so that we can consider finite-dimensional subalgebras $A \subset \mathcal{A}$ as arguments of n-subalgebra entropies, namely, we take the natural embeddings $\iota_A : A \mapsto \mathcal{A}$ as CPU maps $\gamma$. Also, we shall restrict to faithful states $\omega$ so that the only subalgebra with 0 entropy with respect to $\omega$ is the trivial one (see Lemma 6.3.1).


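Theorem 8.1.2 Given a quantum dynamical triple $(\mathcal{A}, \Theta, \omega)$ with $\mathcal{A}$ an AF C* algebra and $\omega$ a faithful state, let $\{c\mathbb{1}\} \subset \mathcal{A}$ denote the trivial subalgebra and consider the following statements

1. the CNT entropy is strictly positive, namely for all non-trivial finite-dimensional subalgebras $\mathcal{A} \supset A \neq \{c\mathbb{1}\}$,

$$h^{CNT}_\omega(\Theta, A) > 0 ; \qquad (8.39)$$

2. for all finite-dimensional subalgebras $\mathcal{A} \supset A \neq \{c\mathbb{1}\}$,

$$\lim_{n \to +\infty} h^{CNT}_\omega(\Theta^n, A) = H_\omega(A) ; \qquad (8.40)$$

3. for all finite-dimensional subalgebras $\mathcal{A} \supset A \neq \{c\mathbb{1}\}$, $\mathcal{A} \supset B$ and all sequences $\{j_k\}_{k \in \mathbb{N}}$ of positive integers,

$$\lim_{n \to +\infty} \liminf_{k \to +\infty} \Big[ H_\omega\big(B, \Theta^{n+j_1}(A), \ldots, \Theta^{n+j_k}(A)\big) - H_\omega\big(\Theta^{n+j_1}(A), \ldots, \Theta^{n+j_k}(A)\big) \Big] = H_\omega(B) ; \qquad (8.41)$$

4. for all finite-dimensional subalgebras $\mathcal{A} \supset A \neq \{c\mathbb{1}\}$ and all sequences $\{j_k\}_{k \in \mathbb{N}}$ of positive integers

$$\lim_{n \to +\infty} \liminf_{k \to +\infty} \Big[ H_\omega\big(B, \Theta^{n+j_1}(A), \ldots, \Theta^{n+j_k}(A)\big) - H_\omega\big(\Theta^{n+j_1}(A), \ldots, \Theta^{n+j_k}(A)\big) \Big] = 0 \;\Longrightarrow\; B = \{c\mathbb{1}\} . \qquad (8.42)$$

They stand in the following relations

$$(8.41) \Longrightarrow (8.40) \Longrightarrow (8.39) , \qquad (8.41) \Longrightarrow (8.42) \Longrightarrow (8.39) .$$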

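Proof (8.40) $\Longrightarrow$ (8.39): Because of the second property in Lemma 6.3.1, one can choose $0 < \varepsilon < H_\omega(A)$ and $n$ such that, using the lower bound in (8.27),

$$h^{CNT}_\omega(\Theta, A) \geq \frac{1}{n}\, h^{CNT}_\omega(\Theta^n, A) \geq \frac{H_\omega(A) - \varepsilon}{n} > 0 .$$

(8.41) $\Longrightarrow$ (8.40): Consider the sequence $\{j_k = (k-1)n\}_{k \geq 1}$; from the assumption it follows that for any $\varepsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that, for all $n \geq n_0$,

$$\liminf_{k \to +\infty} \Big[ H_\omega\big(A, \Theta^n(A), \Theta^{2n}(A), \ldots, \Theta^{kn}(A)\big) - H_\omega\big(\Theta^n(A), \Theta^{2n}(A), \Theta^{3n}(A), \ldots, \Theta^{kn}(A)\big) \Big]$$
$$= \liminf_{k \to +\infty} \Big[ H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{kn}(A)\big) - H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{(k-1)n}(A)\big) \Big] \geq H_\omega(A) - \varepsilon ,$$

where property (8.13) has been used in the first equality. Further, since $\liminf_{k \to +\infty} = \sup_{p \geq 0} \inf_{k \geq p}$, it follows that there exists $p_0 \in \mathbb{N}$ such that, for all $p \geq p_0$,

$$\Delta_p := H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{pn}(A)\big) - H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{(p-1)n}(A)\big)$$
$$\geq \liminf_{k \to +\infty} \Big[ H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{kn}(A)\big) - H_\omega\big(A, \Theta^n(A), \ldots, \Theta^{(k-1)n}(A)\big) \Big] - \varepsilon \geq H_\omega(A) - 2\varepsilon .$$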


Choosing p > po, one thus estimates 


! z H, (A) 
—H, (A, O"(A ee OUP 1) A)| = = y 
(4, o"a) a) =i + 
Since the left hand side of the first inequality is always smaller than $H_\omega(A)$ (see (8.26)), the result follows from the arbitrariness of $\varepsilon > 0$ by letting $p \to +\infty$.

(8.41) $\Longrightarrow$ (8.42): This follows from property 2 in Lemma 6.3.1.

(8.42) $\Longrightarrow$ (8.39): When $A \neq \{c\mathbb{1}\}$, the same argument used to show that (8.41) $\Longrightarrow$ (8.40) implies that, for a suitable $\varepsilon > 0$ and sufficiently large $n$, $h^{CNT}_\omega(\Theta^n, A) \geq \varepsilon$. Thus, the lower bound in (8.27) yields

$$h^{CNT}_\omega(\Theta, A) \geq \frac{1}{n}\, h^{CNT}_\omega(\Theta^n, A) \geq \frac{\varepsilon}{n} > 0 .$$


While in a commutative context the above relations are equivalent, there are no proofs so far that they are such also for quantum dynamical systems. Also, the classical versions of relations (8.39)-(8.42) are equivalent to K-mixing which is the strongest possible way of clustering. If one wants to relate the behavior of the CNT entropy to the mixing properties of the dynamics, among the possible choices, (8.40) appears the most appropriate. Indeed, one knows that $h^{CNT}_\omega(\Theta^n, A) \leq H_\omega(A)$; this is due to the fact that the dynamics usually creates correlations between past and future. Therefore, if, asymptotically, the equality holds as in (8.40), this means that for large intervals between successive events the system is affected by memory loss [258].


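Definition 8.1.4 (Entropic Quantum K-Systems [257]) A quantum dynamical triple $(\mathcal{A}, \Theta, \omega)$ is called an entropic K-system if, for any CPU map $\gamma : M \mapsto \mathcal{A}$ from a finite-dimensional algebra $M$ into $\mathcal{A}$, it holds that

$$\lim_{t \to +\infty} h^{CNT}_\omega(\Theta^t, \gamma) = H_\omega(\gamma) .$$

This choice is also convenient because the behavior of $h^{CNT}_\omega(\Theta^n, M)$ is often more informative on the properties of the dynamics than $h^{CNT}_\omega(\Theta)$. For instance, if $(\mathcal{A}, \Theta, \omega)$ is an entropic K-system, then

$$\lim_{t \to +\infty} H_\omega\big(M, \Theta^t(M), \ldots, \Theta^{t(k-1)}(M)\big) = k\, H_\omega(M) , \qquad (8.43)$$

for all $k \in \mathbb{N}$ and for all finite-dimensional subalgebras $M \subset \mathcal{A}$. Indeed, from (8.23) it follows that

$$h^{CNT}_\omega(\Theta^t, M) \leq \frac{1}{k}\, H_\omega\big(M, \Theta^t(M), \ldots, \Theta^{t(k-1)}(M)\big) \leq H_\omega(M) ,$$

whence the result follows by taking the limit $\lim_{t \to +\infty}$.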
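The asymptotic behavior (8.43) is an effective expression of the memory-loss properties of the dynamics of entropic K-systems. Comparing the contributions in (8.3) to the n-CPU entropies, one sees that, for large $t$, the optimal decompositions $\omega = \sum_{i^{(k)}} \lambda^{(k)}_{i^{(k)}}\, \omega^{(k)}_{i^{(k)}}$ for $H_\omega\big(M, \Theta^t(M), \ldots, \Theta^{t(k-1)}(M)\big)$ must be such that

$$\lim_{t \to +\infty} \Big( H(\Lambda^{(k)}) - \sum_{j=0}^{k-1} H(\Lambda_{j,t}) \Big) = 0 \qquad (8.44)$$
$$\lim_{t \to +\infty} \sum_{i_j} \lambda^j_{i_j,t}\; S\big(\omega^j_{i_j,t} \restriction \Theta^{tj}(M)\,,\; \omega \restriction \Theta^{tj}(M)\big) = H_\omega(M) , \qquad (8.45)$$

where $\Lambda^{(k)} = \{\lambda^{(k)}_{i^{(k)}}\}$, $\Lambda_{j,t} = \{\lambda^j_{i_j,t}\}_{i_j}$ and

$$\lambda^j_{i_j,t} := \sum_{i^{(k)},\; i_j\ \text{fixed}} \lambda^{(k)}_{i^{(k)}} , \qquad \omega^j_{i_j,t} := \sum_{i^{(k)},\; i_j\ \text{fixed}} \frac{\lambda^{(k)}_{i^{(k)}}}{\lambda^j_{i_j,t}}\; \omega^{(k)}_{i^{(k)}} .$$

In fact, the difference in (8.44) can at most vanish, but never be positive, while each of the summands in (8.45) is bounded by $H_\omega(M)$. This also means that the optimal decompositions for large $n$ must be such that the corresponding sub-decompositions are close to being optimal for $H_\omega(M)$.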


Example 8.1.3 Let $(\mathcal{A}_\theta, \Theta_A, \omega)$ denote a quantized hyperbolic automorphism of the torus, where $\mathcal{A}_\theta$ is the C* algebra generated by the Weyl-operators $W_\theta(f)$, with
0 =< as >,s € N; these quantum dynamical systems are norm asymptotic Abelian 
(see Example 7.4.12). By using the exponential decay (7.81) of their commutators, 
we shall show that they are entropic K-systems [251]. 

Let $\gamma : M \mapsto \mathcal{A}_\theta$ be a CPU map from a finite dimensional algebra $M$ into $\mathcal{A}_\theta$. Fix $\varepsilon > 0$ and choose $k$ such that


1 
Hy (y) = h (@4, 7) = a OL o7,...,O,07) —€ 
1 (k) 
> (HAS = yma, ») (8.46) 
1 k—1 ; . ; 
+D DA 5 (uf, 0 OF 070 , wore). (8.47) 


j=0 ij 


Notice that w is the tracial state; thus, its modular operator is trivial whence its 
decompositions can be chosen of the form (see (8.6)) 


w(X; A) 


j 
for $\mathcal{A}_\theta \ni X_j \geq 0$ such that $\sum_j X_j = \mathbb{1}$. Because of the norm-density of the Weyl operators within $\mathcal{A}_\theta$, given $\varepsilon > 0$, we can reach


p 
Ho = INS Wioy, woy) +e (8.48) 


i=1 


by means of a decomposition w = D | Aiwi, Where 


p 
Xj = We(fi)Wo( fi), >) Wo fi)Wo(fi) = 1, 


i=l 
with functions such that fi (—n) = fi(n). Moreover, we can arrange them in such 
a way that fj(n) = 0 forall 1 < j < p—1if||n|| > K, while ||Wo(fi)|| < €. 
As a trial decomposition for H,, (>, (JN ORe AEE of”) , consider the positive 
operators 


L 


YO) = Wol fio) ORIWA Sa) OLTP IW na), 


t 
W (k) (k) 
XO, = ER Y Ws where 


where i® e R. They satisfy }°;«) X O = ll; further, 


I na (k) 
Xii an > XO 

i® ij fixed 

jt T 

= of ( x (iat) Xi; Apical) (8.49) 

ijtietk-1 

k-j-1 

Yini] = OMLWoCfi I <<: ORTI TWoCfig_ D1. (8.50) 


Using the tracial properties, the coefficients of the convex decompositions 


Z J J 
a= ye iit are 


Ala = W(X} 4) = w(Xi;) = Ài; i (8.51) 
Therefore, in (8.46), H(A; s) = H(A) for all 0 < j < k — 1, where, in terms of 
p 
n(x) = —x log x, H(A) = ) (i). Therefore, 


i=l 


1 = 1 
7 (HAM) -P HA) = HAP) - HA) (8.52) 
j=0 
In order to control the probability distribution $\Lambda^{(k)}$ consisting of the coefficients $\lambda_{i^{(k)}}$, we expand


Wolfi) = X fir) Wor), mr € Supp(fi,) . 


By means of the Weyl relations (7.29), they thus read


k-1 
See? E (1ra) 
no-Ak-1 r=0 
mọo...mk—] 
k-1 
x w(Wo( >> "nj — mj))) where 
j=0 
k-la-1 
pdn), (m) = X Y (oB" ny, Bn) — o(B" my, B™m,)) 
a=1 b=0 
k-1 
w(Wo( DOB (ny = m;))) = òptima 
j=0 


where $B = A^T$ is the transpose of the dynamical matrix $A$. By expanding $|n_j - m_j\rangle = \gamma_j |a_+\rangle + \delta_j |a_-\rangle$ along the eigenvectors of $B$, one gets


k-1 


£ qjat )laz) + OF yor) a_) =0, 
j=0 


j=0 


where $\alpha > 1$ and $\alpha^{-1}$ are the eigenvalues of $B$, whence


k-2 k-1 
VR-1 + Sa t =0 = ð + X ya : 
j=0 j=l 


Suppose $1 \leq i_j \leq p - 1$ for all $i_j \in i^{(k)}$; then the $f_{i_j}$ have compact support and, for $t$ sufficiently large, the above equalities imply $\gamma_{k-1} = 0 = \delta_0$. Iterating this argument, one gets $n_j = m_j$ for $0 \leq j \leq k - 1$, whence


k-1 k-1 
Mo,= D [ees ea). 
j=0 


no...Nk—-| r=0 
~ p-l
Then, ew IAe =k 5 n(À;), where >> denotes the sum over i; # p for all 
i=l 


0 < j < k -— 1. Therefore, 


1 1 7 i ‘ 
THAP) = 7 SION) = Tono p) = H(A) -= nA). 


i 


Furthermore, from the assumptions, $\lambda_p = \omega(X_p) \leq \varepsilon$; thus (8.52) can be estimated as follows


k-1 
~(HA®) -5 H(Aj,.)) > eloge. (8.53) 
j=0 


In order to lowerbound (8.47), we first rewrite it by means of (8.51) as 


k-1 ; P 
TEENS (uh, o0 07, woy) = S(wog) -Y AS (wio) 
j=0 ij i=l 
k-1 
+E ¥,(8 (44, 09) - S (uf, 0% 07). 
j=0 i 


Secondly, by means of (8.48), we lowerbound it by 


H,(y) -E + 5D (wi,07)—S(wj,,e0"e7)). 85 


j=0 ij 


Thirdly, we consider the expectations 


j j t 
w(x}, o” IW 8) = 3 w( (Fiir) Xi, Yiyi Wo(8)) 


ij+1--ik—1 
; 
= w(Xi; Wo(g)) + > a| Giria) Xi; [ina was) |) > 
ij+1--İk—1 


and expand the commutator as 


k-j-1 


[Yiia WD] = D> OAW 


a=1 
OL Wia D [ORW Fijsa)]> WO, OL W Gia] 
e OTITP IW Sira). 
Since $W_\theta(f_i)^\dagger W_\theta(f_i) \leq \sum_j W_\theta(f_j)^\dagger W_\theta(f_j) = \mathbb{1}$, then $\|W_\theta(f_i)\| \leq 1$; therefore, using (7.81), one can estimate


[Ysera WoC] < 2 ic [WoC fijsa1+ Wola) || 


< Se = N fia a)l] 180Ma)| Crama - 


a=1 Ng,Mgq 


Consequently, if all functions have compact support, the commutator goes to 0 exponentially fast with $n$; now, the function $f_p$, not necessarily with compact support, is in any case such that $\|W_\theta(f_p)\| \leq \varepsilon$, and any element in $\gamma(M)$ can be approximated in norm within $\varepsilon$ by suitable $W_\theta(g)$ where $g$ has compact support. Therefore, one can adjust $n$ so that $\|\omega^j_{i_j} \circ \Theta_A^{tj} \circ \gamma - \omega_{i_j} \circ \gamma\| \leq \varepsilon$, whence (8.54) and the Fannes inequality (5.167) yield


at 


k-1 
1 ; l , 
k yra race o@loy, wo) > Hu (y) — e€ + h) 


j=0 ij 


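where $h(\varepsilon) \to 0$ when $\varepsilon \to 0$. Together with (8.46), (8.47) and (8.53), this last estimate proves the result; indeed, for all $\gamma : M \mapsto \mathcal{A}_\theta$ and $\varepsilon > 0$, one can choose $t$ large enough so that

$$H_\omega(\gamma) \geq h^{CNT}_\omega(\Theta_A^t, \gamma) \geq H_\omega(\gamma) + \varepsilon \log \varepsilon - 2\varepsilon + h(\varepsilon) .$$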


The previous result highlights the role of asymptotic commutativity in establishing the existence of a memory loss mechanism. One wonders whether the converse is also true, namely whether asymptotic memory loss implies asymptotic Abelianness and to which degree. The following result, whose proof can be found in [58, 264], gives a partial answer to this question.


Proposition 8.1.12 Let $(\mathcal{A}, \Theta, \omega)$ be an entropic K-system with $\mathcal{A}$ a hyperfinite von Neumann algebra of type $II_1$, equipped with the tracial state $\omega$. Then, this dynamical system is strongly asymptotically Abelian.


The following corollary regards quantized hyperbolic automorphisms of the torus with rational deformation parameter $\theta = p/q$, which are algebraic quantum K-systems, and expresses the generic inequivalence of this notion and the one of entropic quantum K-system.


Corollary 8.1.1 The quantized hyperbolic automorphisms of the torus with rational deformation parameter $\theta = p/q$ cannot be entropic K-systems.

Proof As seen in Example 7.4.13, these quantum dynamical systems cannot be strongly asymptotically Abelian.


8.2 AFL Entropy: OPUs 


The quantum dynamical entropy developed by R. Alicki and M. Fannes [10] is based on an earlier approach of Lindblad [230] to the non-commutative generalization of the KS entropy and considers the description of C* quantum dynamical systems $(\mathcal{A}, \Theta, \omega)$ by means of quantum symbolic models. In analogy with classical symbolic models (see Sect. 2.2), the time-evolution $\Theta$ is coarsely reconstructed by means of a shift automorphism $\Theta_\sigma$ on a quantum spin half-chain $\mathcal{A}_{\mathcal{Z}}$ (see Sect. 7.4.5) equipped with a particular (unlike for classical dynamical systems, in general not translation-invariant) state $\omega_{\mathcal{Z}}$. We shall denote these quantum symbolic models by quantum dynamical triples $(\mathcal{A}_{\mathcal{Z}}, \Theta_\sigma, \omega_{\mathcal{Z}})$, where the subindex $\mathcal{Z}$ denotes the fact that they are constructed by means of operational partitions of unity (OPUs) in a way that can be physically interpreted as corresponding to repeated measurements performed on the system $(\mathcal{A}, \Theta, \omega)$.


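Definition 8.2.1 (Operational Partitions of Unity) An operational partition of unity in $\mathcal{A}$ is any finite collection of operators $\mathcal{Z} = \{Z_i\}_{i=1}^{|\mathcal{Z}|}$, $Z_i \in \mathcal{A}$, such that

$$\sum_{i=1}^{|\mathcal{Z}|} Z_i^\dagger Z_i = \mathbb{1} , \qquad (8.55)$$

where $|\mathcal{Z}|$ is the cardinality of $\mathcal{Z}$.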
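
Condition (8.55) is easy to test numerically for matrix algebras. The following is a minimal sketch for $2 \times 2$ matrices, written with the Python standard library only; the function names (`dagger`, `matmul`, `is_opu`) and the three sample collections are illustrative assumptions, not objects defined in the text.

```python
import math


def dagger(m):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_opu(ops, tol=1e-12):
    """Check sum_i Z_i^dagger Z_i = identity, i.e. condition (8.55)."""
    n = len(ops[0])
    total = [[0j] * n for _ in range(n)]
    for z in ops:
        zz = matmul(dagger(z), z)
        for i in range(n):
            for j in range(n):
                total[i][j] += zz[i][j]
    return all(abs(total[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


s = 1 / math.sqrt(2)
identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]

# Two rescaled unitaries: a genuinely non-projective OPU.
unitary_opu = [[[s * x for x in row] for row in identity],
               [[s * x for x in row] for row in sigma_x]]
# Orthogonal projections: the OPU of a classical two-atom partition.
projective_opu = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
# Unscaled unitaries sum to 2 * identity, so (8.55) fails.
not_an_opu = [identity, sigma_x]

print(is_opu(unitary_opu), is_opu(projective_opu), is_opu(not_an_opu))
```

The second collection is the matrix analogue of a partition into characteristic functions, while the first has no classical counterpart.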


OPUs correspond to the POVM measurements typical of quantum information (see Definition 5.6.1); as already observed, they are the most general algebraic extension of the notion of classical partitions to quantum systems. Furthermore, we shall see that OPUs can profitably be used instead of partitions even in a classical context.

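Given two OPUs $\mathcal{Z}_1 = \{Z_{1i}\}_{i=1}^{|\mathcal{Z}_1|}$ and $\mathcal{Z}_2 = \{Z_{2j}\}_{j=1}^{|\mathcal{Z}_2|}$, the algebraic extension of the notion of refinement of two partitions (see Sect. 2.2) is as follows

$$\mathcal{Z}_1 \circ \mathcal{Z}_2 := \{Z_{1i} Z_{2j}\}_{i=1,\, j=1}^{|\mathcal{Z}_1|,\, |\mathcal{Z}_2|} . \qquad (8.56)$$

What one gets by this definition is a finer OPU; indeed,

$$\sum_{i=1}^{|\mathcal{Z}_1|} \sum_{j=1}^{|\mathcal{Z}_2|} Z_{2j}^\dagger Z_{1i}^\dagger Z_{1i} Z_{2j} = \sum_{j=1}^{|\mathcal{Z}_2|} Z_{2j}^\dagger Z_{2j} = \mathbb{1} .$$

Moreover, consider a measure-theoretic triple $(\mathcal{X}, T, \mu)$ and the corresponding von Neumann algebraic commutative triple $(L^\infty_\mu(\mathcal{X}), \Theta_T, \omega_\mu)$. One can associate to two finite, measurable partitions $\mathcal{P} = \{P_i\}_{i=1}^{|\mathcal{P}|}$ and $\mathcal{Q} = \{Q_j\}_{j=1}^{|\mathcal{Q}|}$ of the measure space $\mathcal{X}$ the OPUs $\mathcal{Z}_{\mathcal{P}}$ and $\mathcal{Z}_{\mathcal{Q}}$ from $L^\infty_\mu(\mathcal{X})$ consisting of the characteristic functions $\chi_{P_i}$ and $\chi_{Q_j}$ of their atoms. It then turns out that $\mathcal{Z}_{\mathcal{P}} \circ \mathcal{Z}_{\mathcal{Q}} = \{\chi_{P_i \cap Q_j}\}_{i=1,\, j=1}^{|\mathcal{P}|,\, |\mathcal{Q}|}$ corresponds to the refined partition $\mathcal{P} \vee \mathcal{Q}$.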


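Definition 8.2.2 Given an OPU $\mathcal{Z} = \{Z_i\}_{i=1}^{|\mathcal{Z}|} \subset \mathcal{A}$, its time-evolution at time $t = k \in \mathbb{Z}$ under the dynamics $\Theta$ is defined as

$$\mathcal{Z}^k := \Theta^k(\mathcal{Z}) = \{\Theta^k(Z_i)\}_{i=1}^{|\mathcal{Z}|} . \qquad (8.57)$$

Further, $\mathcal{Z}^{(n)}$ will denote the OPU

$$\mathcal{Z}^{(n)} := \Theta^{n-1}(\mathcal{Z}) \circ \cdots \circ \Theta(\mathcal{Z}) \circ \mathcal{Z} = \{Z_{i^{(n)}}\}_{i^{(n)} \in \Omega^{(n)}_{|\mathcal{Z}|}} , \qquad (8.58)$$

where

$$Z_{i^{(n)}} = \Theta^{n-1}(Z_{i_{n-1}}) \cdots \Theta(Z_{i_1})\, Z_{i_0} , \qquad (8.59)$$

and $\Omega^{(n)}_{|\mathcal{Z}|} \ni i^{(n)} := i_0 i_1 \cdots i_{n-1}$ with $i_j \in \{1, 2, \ldots, |\mathcal{Z}|\}$.


That $\mathcal{Z}^k$ is an OPU follows since $\Theta$ is an automorphism of $\mathcal{A}$:

$$\sum_{i=1}^{|\mathcal{Z}|} \Theta^k(Z_i)^\dagger\, \Theta^k(Z_i) = \Theta^k\Big(\sum_{i=1}^{|\mathcal{Z}|} Z_i^\dagger Z_i\Big) = \Theta^k(\mathbb{1}) = \mathbb{1} .$$

Then, $\mathcal{Z}^{(n)}$ is also an OPU.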


8.2.1 Quantum Symbolic Models and AFL Entropy 


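Given an OPU $\mathcal{Z} = \{Z_i\}_{i=1}^{|\mathcal{Z}|}$, let $\{|z_i\rangle\}_{i=1}^{|\mathcal{Z}|}$ denote a fixed orthonormal basis in the finite-dimensional Hilbert space $\mathbb{C}^{|\mathcal{Z}|}$. The $|\mathcal{Z}| \times |\mathcal{Z}|$ matrix

$$M_{|\mathcal{Z}|}(\mathbb{C}) \ni \rho[\mathcal{Z}] := \sum_{i,j=1}^{|\mathcal{Z}|} |z_i\rangle\langle z_j|\; \omega(Z_j^\dagger Z_i) \qquad (8.60)$$

is a density matrix. Indeed, normalization comes from Definition 8.2.1, while positivity is ensured by the fact that

$$\langle \psi | \rho[\mathcal{Z}] | \psi \rangle = \sum_{i,j=1}^{|\mathcal{Z}|} \psi_i^* \psi_j\; \omega(Z_j^\dagger Z_i) = \omega(Z_\psi^\dagger Z_\psi) \geq 0 ,$$

where $\mathbb{C}^{|\mathcal{Z}|} \ni |\psi\rangle = \sum_{i=1}^{|\mathcal{Z}|} \psi_i |z_i\rangle$ and $Z_\psi := \sum_{i=1}^{|\mathcal{Z}|} \psi_i^* Z_i$.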
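
As an illustration of (8.60), the following minimal sketch builds $\rho[\mathcal{Z}]$ for two-element OPUs in $M_2(\mathbb{C})$ and computes its von Neumann entropy. It assumes, purely for the sake of the example, that $\omega$ is the normalized trace; all function names and the two sample OPUs are hypothetical choices, and only the Python standard library is used.

```python
import math


def dagger(m):
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def tracial_state(m):
    """Assumed state: omega(X) = Tr(X) / dimension."""
    return sum(m[i][i] for i in range(len(m))) / len(m)


def opu_density_matrix(ops):
    """rho[Z]_{ij} = omega(Z_j^dagger Z_i), as in (8.60)."""
    n = len(ops)
    return [[tracial_state(matmul(dagger(ops[j]), ops[i])) for j in range(n)]
            for i in range(n)]


def spectrum_and_entropy_2x2(rho):
    """Eigenvalues and von Neumann entropy of a 2x2 Hermitian matrix."""
    a, d = complex(rho[0][0]).real, complex(rho[1][1]).real
    b = abs(complex(rho[0][1]))
    mean = (a + d) / 2
    gap = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    eigs = [mean + gap, mean - gap]
    entropy = sum(-x * math.log(x) for x in eigs if x > 1e-15)
    return eigs, entropy


s = 1 / math.sqrt(2)
opu_a = [[[s, 0], [0, s]], [[0, s], [s, 0]]]        # {1/sqrt2, sigma_x/sqrt2}
opu_b = [[[s, 0], [0, s]], [[s, 0], [0, s * 1j]]]   # {1/sqrt2, diag(1, i)/sqrt2}

rho_a = opu_density_matrix(opu_a)
rho_b = opu_density_matrix(opu_b)
eigs_a, entropy_a = spectrum_and_entropy_2x2(rho_a)
eigs_b, entropy_b = spectrum_and_entropy_2x2(rho_b)
print(entropy_a, entropy_b)
```

In the first case the off-diagonal entry $\omega(Z_2^\dagger Z_1)$ vanishes and the entropy is maximal; in the second the two OPU elements overlap in the tracial state and the entropy is strictly smaller.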
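Furthermore, from $\omega \circ \Theta = \omega$, it follows that

$$\rho[\mathcal{Z}^t] = \rho[\mathcal{Z}] \qquad \forall\, t \in \mathbb{Z} . \qquad (8.61)$$

Consider the time-refined OPU $\mathcal{Z}^{(n)}$; the corresponding density matrix is of the form

$$M_{|\mathcal{Z}|}(\mathbb{C})^{\otimes n} \ni \rho[\mathcal{Z}^{(n)}] = \sum_{i^{(n)},\, j^{(n)} \in \Omega^{(n)}_{|\mathcal{Z}|}} |z_{i^{(n)}}\rangle\langle z_{j^{(n)}}|\; \omega\big(Z_{j^{(n)}}^\dagger Z_{i^{(n)}}\big) , \qquad (8.62)$$

where

$$|z_{i^{(n)}}\rangle = |z_{i_0}\rangle \otimes |z_{i_1}\rangle \otimes \cdots \otimes |z_{i_{n-1}}\rangle .$$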

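At each iteration of the dynamics $\Theta$, one component is added to the OPU $\mathcal{Z}^{(n)}$ and one factor to the corresponding algebraic tensor product $M_{|\mathcal{Z}|}(\mathbb{C})^{\otimes n}$. Therefore, to any given OPU $\mathcal{Z} \subset \mathcal{A}$ there remains associated a quantum spin half-chain $\mathcal{A}_{\mathcal{Z}}$ (see Sect. 7.4.5), with a $|\mathcal{Z}|$-dimensional spin at each site and a family of density matrices $\rho[\mathcal{Z}^{(n)}]$, $n \in \mathbb{N}$. Since $\Theta$ is an automorphism, applying Definition 8.2.1, it turns out that these density matrices form a compatible family in the sense of (7.85), namely

$$\mathrm{Tr}_{n+1}\big(\rho[\mathcal{Z}^{(n+1)}]\big) = \sum_{i^{(n)},\, j^{(n)}} |z_{i^{(n)}}\rangle\langle z_{j^{(n)}}|\; \omega\Big(Z_{j^{(n)}}^\dagger\, \Theta^n\Big(\sum_{i=1}^{|\mathcal{Z}|} Z_i^\dagger Z_i\Big) Z_{i^{(n)}}\Big) = \rho[\mathcal{Z}^{(n)}] .$$

Thus the family $\rho[\mathcal{Z}^{(n)}]$, $n \in \mathbb{N}$, provides a state $\omega_{\mathcal{Z}}$ over $\mathcal{A}_{\mathcal{Z}}$.

The dynamics $\Theta$ on $\mathcal{A}$ corresponds to moving right along $\mathcal{A}_{\mathcal{Z}}$ with the shift automorphism $\Theta_\sigma$; however, unlike the states of quantum spin chains (see Definition 7.4.11) which are $\Theta_\sigma$-invariant, the compatible family $\{\rho[\mathcal{Z}^{(n)}]\}_{n \in \mathbb{N}}$ need not satisfy condition (7.86); namely, in general, $\omega_{\mathcal{Z}} \circ \Theta_\sigma \neq \omega_{\mathcal{Z}}$. For instance, in general,

$$\mathrm{Tr}_1\big(\rho[\mathcal{Z}^{(2)}]\big) = \sum_{i,j=1}^{|\mathcal{Z}|} |z_i\rangle\langle z_j| \sum_{k=1}^{|\mathcal{Z}|} \omega\big(Z_k^\dagger\, \Theta(Z_j^\dagger Z_i)\, Z_k\big) \neq \rho[\mathcal{Z}] .$$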


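In the classical setting the KS entropy is the maximal Shannon entropy per symbol over all symbolic models built upon finite measurable partitions. The AFL construction defines the quantum dynamical entropy of $(\mathcal{A}, \Theta, \omega)$ as the largest mean von Neumann entropy over all its symbolic models $(\mathcal{A}_{\mathcal{Z}}, \Theta_\sigma, \omega_{\mathcal{Z}})$ constructed from OPUs $\mathcal{Z}$ chosen from a selected $\Theta$-invariant subalgebra $\mathcal{A}_0 \subseteq \mathcal{A}$. Because of the lack of translation invariance, it is not guaranteed that the mean von Neumann entropy of $(\mathcal{A}_{\mathcal{Z}}, \Theta_\sigma, \omega_{\mathcal{Z}})$ exists as a limit.


Definition 8.2.3 (AFL Entropy) Let $\mathcal{A}_0 \subseteq \mathcal{A}$ be a $\Theta$-invariant subalgebra and let $\mathcal{Z} \subset \mathcal{A}_0$ be an OPU; set

$$h^{AFL}_{\omega, \mathcal{A}_0}(\Theta, \mathcal{Z}) := \limsup_{n \to \infty} \frac{1}{n}\, S\big(\rho[\mathcal{Z}^{(n)}]\big) , \qquad (8.63)$$

where $S\big(\rho[\mathcal{Z}^{(n)}]\big)$ is the von Neumann entropy of the density matrix associated with the OPU $\mathcal{Z}^{(n)}$. The AFL entropy of $(\mathcal{A}, \Theta, \omega)$ is then defined as

$$h^{AFL}_{\omega, \mathcal{A}_0}(\Theta) := \sup_{\mathcal{Z} \subset \mathcal{A}_0} h^{AFL}_{\omega, \mathcal{A}_0}(\Theta, \mathcal{Z}) . \qquad (8.64)$$

When needed, we shall explicitly refer to the dependence on $\mathcal{A}_0$ by writing $h^{AFL}_\omega(\Theta, \mathcal{A}_0)$.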


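As for the CNT entropy (compare (8.27)), when one considers powers $\Theta^q$, $q > 0$, of the automorphism $\Theta$, one has the following bound.


Proposition 8.2.1 For all $\mathbb{N} \ni q \geq 1$, it holds that $\frac{1}{q}\, h^{AFL}_{\omega, \mathcal{A}_0}(\Theta^q) \geq h^{AFL}_{\omega, \mathcal{A}_0}(\Theta)$.


Proof For any given OPU $\mathcal{Z} = \{Z_i\}_{i=1}^{|\mathcal{Z}|}$, set

$$\mathcal{Z}^{(q)} := \Theta^{q-1}[\mathcal{Z}] \circ \Theta^{q-2}[\mathcal{Z}] \circ \cdots \circ \Theta[\mathcal{Z}] \circ \mathcal{Z} .$$

Given the OPU $\mathcal{Z}^{(q)}$, $q \geq 1$, one verifies that its $k$-fold refinement under $\Theta^q$, denoted by $(\mathcal{Z}^{(q)})^{(k)}_{\Theta^q}$, coincides with $\mathcal{Z}^{(kq)}$. Therefore, writing $n = kq + p$ with $0 \leq p < q$, by means of (5.171) one gets

$$h^{AFL}_{\omega, \mathcal{A}_0}(\Theta, \mathcal{Z}) = \limsup_{n \to \infty} \frac{1}{n}\, S\big(\rho[\mathcal{Z}^{(n)}]\big) = \limsup_{k \to \infty} \frac{1}{kq + p}\, S\big(\rho[\mathcal{Z}^{(kq+p)}]\big)$$
$$\leq \frac{1}{q} \limsup_{k \to \infty} \frac{1}{k}\, S\big(\rho[\mathcal{Z}^{(kq)}]\big) = \frac{1}{q} \limsup_{k \to \infty} \frac{1}{k}\, S\big(\rho[(\mathcal{Z}^{(q)})^{(k)}_{\Theta^q}]\big) = \frac{1}{q}\, h^{AFL}_{\omega, \mathcal{A}_0}(\Theta^q, \mathcal{Z}^{(q)}) .$$

In fact, since the states $\rho[\mathcal{Z}^{(n)}]$ are density matrices on a spin-algebra $M_{|\mathcal{Z}|}(\mathbb{C})^{\otimes n} = \bigotimes_{j=0}^{n-1} (M_{|\mathcal{Z}|}(\mathbb{C}))_j$, one derives the bound

$$\Big| S\big(\rho[\mathcal{Z}^{(kq+p)}]\big) - S\big(\rho[\mathcal{Z}^{(kq)}]\big) \Big| \leq S\big(\rho_{[kq,\, kq+p]}\big) \leq q \log |\mathcal{Z}| ,$$

where $\rho_{[kq,\, kq+p]}$ is the marginal density matrix on $\bigotimes_{j=kq}^{kq+p-1} (M_{|\mathcal{Z}|}(\mathbb{C}))_j$. Since OPUs of the form $\mathcal{Z}^{(q)}$ are a subclass of all possible OPUs, one concludes

$$h^{AFL}_{\omega, \mathcal{A}_0}(\Theta) = \sup_{\mathcal{Z} \subset \mathcal{A}_0} h^{AFL}_{\omega, \mathcal{A}_0}(\Theta, \mathcal{Z}) \leq \frac{1}{q}\, h^{AFL}_{\omega, \mathcal{A}_0}(\Theta^q) .$$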


Remarks 8.2.1


1. Suppose the dynamics is trivial, $\Theta = \mathrm{id}_{\mathcal{A}}$, namely $\Theta[A] = A$ for all $A \in \mathcal{A}$; from the previous result it follows that, if $h^{AFL}_{\omega, \mathcal{A}_0}(\mathrm{id}_{\mathcal{A}}) > 0$, then it is infinite, for one can choose an arbitrarily large $q$ and $\mathrm{id}_{\mathcal{A}}^q = \mathrm{id}_{\mathcal{A}}$. This effect is clearly due to the perturbing action of the OPUs which themselves act as a source of entropy. Therefore, the $\Theta$-invariant subalgebra $\mathcal{A}_0$ from where the OPUs are taken has to be chosen in such a way as to minimize these perturbing effects.

2. The request that OPUs consist of elements from a selected $\Theta$-invariant subalgebra $\mathcal{A}_0 \subseteq \mathcal{A}$ usually comes from physical considerations. Indeed, OPUs as POVMs should correspond to physically realizable measurement processes which are always strictly local, namely they should consist of operators from local subalgebras of $\mathcal{A}$. The obvious choice for $\mathcal{A}_0$ is thus the *-algebra containing all strictly local C* algebras.

3. Unlike for the CNT entropy (see Remark 8.1.2), it is not known whether an equality of the form $h^{AFL}_{\omega, \mathcal{A}_0}(\Theta^q) = q\, h^{AFL}_{\omega, \mathcal{A}_0}(\Theta)$ holds. Indeed, the key ingredient in the proof of this equality for the CNT entropy is its strong continuity which is not usable in the case of the AFL entropy. Continuity is also important to check on its dependence on the OPUs: for results in this direction see [11, 161].


8.2.2 AFL Entropy: Interpretation 


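Like the KS entropy, one can interpret the AFL entropy as the asymptotic rate of information provided by repeated, coarse-grained observations of the time-evolution; the difference from the classical setting is that a coupling to an external ancillary system is required. This can be seen by going to the GNS construction $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ based on the $\Theta$-invariant state $\omega$.

Consider an OPU $\mathcal{Z} = \{Z_i\}_{i=1}^{|\mathcal{Z}|}$; the pure state projection onto

$$\mathcal{H}_\omega \otimes \mathbb{C}^{|\mathcal{Z}|} \ni |\Psi_{\mathcal{Z}}\rangle := \sum_{i=1}^{|\mathcal{Z}|} \pi_\omega(Z_i) |\Omega_\omega\rangle \otimes |z_i\rangle$$

has the following marginal density matrices (see (8.62))

$$\rho_I := \mathrm{Tr}_{II}\big(|\Psi_{\mathcal{Z}}\rangle\langle\Psi_{\mathcal{Z}}|\big) = \sum_{i=1}^{|\mathcal{Z}|} \pi_\omega(Z_i) |\Omega_\omega\rangle\langle\Omega_\omega| \pi_\omega(Z_i)^\dagger =: \mathbb{F}_{\mathcal{Z}}\big[|\Omega_\omega\rangle\langle\Omega_\omega|\big] , \qquad (8.65)$$
$$\rho_{II} := \mathrm{Tr}_{I}\big(|\Psi_{\mathcal{Z}}\rangle\langle\Psi_{\mathcal{Z}}|\big) = \sum_{i,j=1}^{|\mathcal{Z}|} \langle\Omega_\omega| \pi_\omega(Z_j^\dagger Z_i) |\Omega_\omega\rangle\; |z_i\rangle\langle z_j| = \rho[\mathcal{Z}] .$$

The first marginal state is a mixed state on $\mathcal{H}_\omega$ resulting from a POVM measurement (see Definition 5.6.1) on the GNS state $|\Omega_\omega\rangle\langle\Omega_\omega|$. This effect corresponds to the action of a map which, in the GNS representation, is the dual of the following CPU map on $\mathcal{A}$:

$$\mathcal{A} \ni A \mapsto \mathbb{E}_{\mathcal{Z}}[A] := \sum_{i=1}^{|\mathcal{Z}|} Z_i^\dagger A Z_i .$$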


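Since $|\Psi_{\mathcal{Z}}\rangle\langle\Psi_{\mathcal{Z}}|$ is a pure state, Proposition 5.5.6 ensures that $\rho[\mathcal{Z}]$ and $\mathbb{F}_{\mathcal{Z}}\big[|\Omega_\omega\rangle\langle\Omega_\omega|\big]$ have the same von Neumann entropy. The same argument applies to the case of the refined OPU $\mathcal{Z}^{(n)}$: the von Neumann entropy $S\big(\rho[\mathcal{Z}^{(n)}]\big)$ equals that of

$$\mathbb{F}_{\mathcal{Z}^{(n)}}\big[|\Omega_\omega\rangle\langle\Omega_\omega|\big] = \sum_{i^{(n)} \in \Omega^{(n)}_{|\mathcal{Z}|}} \pi_\omega(Z_{i^{(n)}}) |\Omega_\omega\rangle\langle\Omega_\omega| \pi_\omega(Z_{i^{(n)}})^\dagger , \qquad (*)$$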

with Zin) as in Definition 8.2.2. Using the GNS implementation of the dynamics, 
Ty (O(X)) = Ula, (X)U,,, and the fact that Us| 2u) = | 2u ), one rewrites 


To (Ze) | 2o) = UE Zi DUDIT) Cet Zi Ut Zy)| 2a) 
= US (Usa Zins Wy a Ur) | Qu) 


whence, setting U,,[A] := U, A sae A € A, (*) can be recast as 
9) n + wW 7 
Foll 2.) ul] = U o (U! o FZ) U2.) Ql. 860 
It thus follows that 
S (Eg [l Qu) Ru l) = s (ol) ; (8.67) 


Therefore, the AFL entropy can be regarded as the largest entropy production pro- 
vided by POVM measurements based on a selected class of OPUs and performed at 
each tick of time on the evolving system coupled to a purifying GNS ancilla. 




Because of this fact, while the CNT entropy corresponds to the maximal compression rate of ergodic quantum sources, the AFL entropy appears to be related to the classical capacity of quantum channels. We shall show this after providing examples of quantum dynamical systems where the AFL entropy can be explicitly computed.


8.2.3 AFL Entropy: Applications


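As for the CNT entropy, we first ascertain whether the AFL entropy reduces to the KS entropy when the algebraic dynamical triple $(\mathcal{A}, \Theta, \omega)$ describes a classical dynamical system $(\mathcal{X}, T, \mu)$.

As already remarked, classical partitions $\mathcal{P} = \{P_i\}_{i=1}^{|\mathcal{P}|}$ of $\mathcal{X}$ are associated with OPUs $\mathcal{Z}_{\mathcal{P}} = \{\chi_{P_i}\}_{i=1}^{|\mathcal{P}|}$; then

$$\mathcal{Z}_{\mathcal{P}}^{(n)} = \Theta_T^{n-1}(\mathcal{Z}_{\mathcal{P}}) \circ \cdots \circ \Theta_T(\mathcal{Z}_{\mathcal{P}}) \circ \mathcal{Z}_{\mathcal{P}} = \Big\{\chi_{T^{-n+1}(P_{i_{n-1}})} \cdots \chi_{T^{-1}(P_{i_1})}\, \chi_{P_{i_0}}\Big\}$$
$$= \Big\{\chi_{T^{-n+1}(P_{i_{n-1}}) \cap \cdots \cap T^{-1}(P_{i_1}) \cap P_{i_0}}\Big\} = \big\{\chi_{P_{i^{(n)}}}\big\} ,$$

so that $\mathcal{Z}_{\mathcal{P}}^{(n)}$ is the OPU associated with the refined partition $\mathcal{P}^{(n)}$ (see footnote 6) and

$$\rho[\mathcal{Z}_{\mathcal{P}}^{(n)}] = \sum_{i^{(n)},\, j^{(n)} \in \Omega^{(n)}_{|\mathcal{Z}_{\mathcal{P}}|}} |z_{i^{(n)}}\rangle\langle z_{j^{(n)}}|\; \mu\big(P_{i^{(n)}} \cap P_{j^{(n)}}\big) = \sum_{i^{(n)} \in \Omega^{(n)}_{|\mathcal{Z}_{\mathcal{P}}|}} |z_{i^{(n)}}\rangle\langle z_{i^{(n)}}|\; \mu\big(P_{i^{(n)}}\big)$$

is diagonal with eigenvalues $\mu(P_{i^{(n)}})$ so that (see (3.1))

$$S\big(\rho[\mathcal{Z}_{\mathcal{P}}^{(n)}]\big) = -\sum_{i^{(n)} \in \Omega^{(n)}_{|\mathcal{P}|}} \mu(P_{i^{(n)}}) \log \mu(P_{i^{(n)}}) = H_\mu(\mathcal{P}^{(n)}) ,$$
$$\limsup_{n \to \infty} \frac{1}{n}\, S\big(\rho[\mathcal{Z}_{\mathcal{P}}^{(n)}]\big) = h^{KS}_\mu(T, \mathcal{P}) . \qquad (8.68)$$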
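
The reduction in (8.68) to Shannon entropies of refined partitions can be made concrete on a toy model. The following is a minimal sketch, assuming a stationary two-state Markov shift whose time-zero partition plays the role of $\mathcal{P}$; the transition matrix, the stationary vector and the function names are illustrative choices, not taken from the text.

```python
import itertools
import math


def shannon(probs):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return sum(-p * math.log(p) for p in probs if p > 0)


def block_entropy(pi, trans, n):
    """H_mu(P^(n)): entropy of the measures of the atoms of the n-fold refinement.

    For a Markov shift the atom labelled i_0 i_1 ... i_{n-1} has measure
    pi[i_0] * trans[i_0][i_1] * ... * trans[i_{n-2}][i_{n-1}].
    """
    atoms = []
    for word in itertools.product(range(len(pi)), repeat=n):
        measure = pi[word[0]]
        for a, b in zip(word, word[1:]):
            measure *= trans[a][b]
        atoms.append(measure)
    return shannon(atoms)


def entropy_rate(pi, trans):
    """Entropy per symbol of the stationary Markov chain."""
    return sum(pi[a] * shannon(trans[a]) for a in range(len(pi)))


trans = [[0.9, 0.1], [0.5, 0.5]]
pi = [5 / 6, 1 / 6]  # stationary: pi * trans = pi

rate = entropy_rate(pi, trans)
per_symbol = [block_entropy(pi, trans, n) / n for n in range(1, 11)]
print(rate, per_symbol[0], per_symbol[-1])
```

The ratios $\frac{1}{n} H_\mu(\mathcal{P}^{(n)})$ decrease monotonically towards the entropy rate of the chain, which is the value the lim sup in (8.68) picks out for this partition.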


However, in view of the fact that one is free to choose more general OPUs than those arising from classical partitions, Definition 8.2.3 may in general lead to an AFL entropy of $(\mathcal{X}, T, \mu)$ which is larger than its KS entropy. Actually, this is not the case: in order to prove it let us consider a generic OPU given by a finite collection $\mathcal{F} = \{f_i\}_{i=1}^{|\mathcal{F}|}$ of essentially bounded functions such that

$$\sum_{i=1}^{|\mathcal{F}|} |f_i|^2 = \mathbb{1} \in L^\infty_\mu(\mathcal{X}) .$$


6 The notation is that used in (2.40) and (2.39).

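The corresponding density matrix (see (8.60)) reads

$$\rho[\mathcal{F}] = \sum_{i,j=1}^{|\mathcal{F}|} |z_i\rangle\langle z_j| \int_{\mathcal{X}} d\mu(x)\, f_j^*(x) f_i(x) = \int_{\mathcal{X}} d\mu(x)\, P_{\mathcal{F}}(x) , \qquad (8.69)$$

with $\{|z_i\rangle\}_{i=1}^{|\mathcal{F}|}$ an ONB in $\mathbb{C}^{|\mathcal{F}|}$ and

$$M_{|\mathcal{F}|}(\mathbb{C}) \ni P_{\mathcal{F}}(x) = |\psi_{\mathcal{F}}(x)\rangle\langle\psi_{\mathcal{F}}(x)| , \qquad |\psi_{\mathcal{F}}(x)\rangle = \sum_{i=1}^{|\mathcal{F}|} f_i(x) |z_i\rangle . \qquad (8.70)$$

Notice that, because $\mathcal{F}$ is an OPU, the $P_{\mathcal{F}}(x)$ are projections onto normalized vector states. If an OPU results from the refinement of other OPUs, then the associated density matrix is a continuous convex combination of tensor products of projections of the form (8.70). Concretely, if $\mathcal{F} = \mathcal{F}_1 \circ \mathcal{F}_2 \circ \cdots \circ \mathcal{F}_n$,

$$\bigotimes_{j=1}^{n} M_{|\mathcal{F}_j|}(\mathbb{C}) \ni \rho[\mathcal{F}] = \int_{\mathcal{X}} d\mu(x)\, P_{\mathcal{F}_1}(x) \otimes P_{\mathcal{F}_2}(x) \otimes \cdots \otimes P_{\mathcal{F}_n}(x) . \qquad (8.71)$$

Using this expression it is possible to prove that, without restrictions on the OPUs taken from $\mathcal{M} := L^\infty_\mu(\mathcal{X})$, the AFL entropy of $(\mathcal{M}, \Theta_T, \omega_\mu)$ coincides with the KS entropy of $(\mathcal{X}, T, \mu)$.


Proposition 8.2.2 $h^{AFL}_{\omega_\mu}(\Theta_T, \mathcal{M}) = h^{KS}_\mu(T)$ (see Definition 8.2.3).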


Proof Let P = {Pe qı be any finite measurable partition of 4 with p™ = 


LE. mk MeQ” its ar refinement up to time t = n — | and let F be any other 


OPU from M. Given F™ as in (8.58), let 


Pron (x) := PEx) 8 Pri(x) Q- Penix). (%) 
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Since the atoms of P™ are disjoint, one can decompose p[F] into a convex sum 
of other density matrices in MF, (C)®": 


pr) = f duw Pron (x) = DY (Pins) Pie 
xX Meg 


Pi = du (x) Pew (x) é 


1 
ply p” 
HCP, o) Piin) 


Thus, the concavity of the von Neumann entropy (5.166) and the triangle inequality 
(5.171) implies 


s(AF®i) <- E PR oP D WPL) S (ej) 
Mea” Mea” 
=H,P™) + J MPS (om) 
Mea” 


n—1 


< A(P™) + Yo Suey) S$ (ei). 6% 


iMeQm j= =0 


where, from (*), pk := du (x) Pre (x). 


uP”) p® 
uP m) YP i(n) 


Let Ox € M\z\(C) be a projection; since S (Qx) = 0, the Fannes inequality 
(5.167) implies 


S (Pk) < llek — Qella log |F| + nok — Qelly) - 


Notice that each px € C!*! is a continuous convex combination of pure state projec- 
tions Pgo) (x); the partition P is arbitrary and can always be chosen in such a way 
that each p% stays sufficiently close to a projection Q% so that the right hand side 
of the previous inequality can be upperbounded by a quantity independent of n and 
i, Consequently, dividing both sides of (xx) by n and taking the lim sup obtains 


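$$
h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{F}) \le \limsup_{n \to +\infty} \frac{1}{n} H_\mu(\mathcal{P}^{(n)})
= h^{\mathrm{KS}}_{\mu}(T, \mathcal{P}) \le h^{\mathrm{KS}}_{\mu}(T) .
$$

On the other hand, from (8.68) one gets

$$
h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}) \ge \sup_{\mathcal{Z}_{\mathcal{P}} \subset \Pi} h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{Z}_{\mathcal{P}}) = h^{\mathrm{KS}}_{\mu}(T) ,
$$

where $\Pi$ is the $*$-subalgebra of $\mathcal{M}$ containing the OPUs $\mathcal{Z}_{\mathcal{P}}$ arising from all possible
measurable partitions of $\mathcal{X}$.

Evidently, one would like to reach the KS entropy by computing the AFL
entropy on a smaller set than the whole of $\mathcal{M} = L^\infty_\mu(\mathcal{X})$. The search for a suitable
$*$-subalgebra $\mathcal{M}_0 \subset \mathcal{M}$ starts with the introduction [11] of an entropic distance
between two OPUs $\mathcal{F}_{1,2} \subset \mathcal{M}$ defined by

$$
\Delta[\mathcal{F}_1|\mathcal{F}_2] = S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2]\big) - S\big(\rho[\mathcal{F}_2]\big) . \tag{8.72}
$$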


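Some useful properties of the entropic distance can be extracted by inspecting
more closely the consequences of (8.71). Indeed, it turns out that, in a commutative
context, the entropy of a composite OPU is invariant under permutations of the
constituent OPUs:

$$
S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2 \circ \mathcal{F}_3]\big) = S\big(\rho[\mathcal{F}_2 \circ \mathcal{F}_1 \circ \mathcal{F}_3]\big) , \tag{8.73}
$$

for all OPUs $\mathcal{F}_{1,2,3} \subset \mathcal{M}$. In fact, because of their tensor product form, the density
matrices $\rho[\mathcal{F}_1 \circ \mathcal{F}_2 \circ \mathcal{F}_3]$ and $\rho[\mathcal{F}_2 \circ \mathcal{F}_1 \circ \mathcal{F}_3]$ are unitarily equivalent. Also, (5.171)
yields

$$
S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2]\big) \le S\big(\rho[\mathcal{F}_1]\big) + S\big(\rho[\mathcal{F}_2]\big) \tag{8.74}
$$

for all OPUs $\mathcal{F}_{1,2} \subset \mathcal{M}$. Indeed, by partial tracing $\rho[\mathcal{F}_1 \circ \mathcal{F}_2]$ over $\mathbb{C}^{|\mathcal{F}_1|}$, respectively
$\mathbb{C}^{|\mathcal{F}_2|}$, one gets

$$
\begin{aligned}
\mathrm{Tr}_1\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2]\big) &= \int_{\mathcal{X}} d\mu(x)\, \mathrm{Tr}\big(P_{\mathcal{F}_1}(x)\big)\, P_{\mathcal{F}_2}(x) = \rho[\mathcal{F}_2] , \\
\mathrm{Tr}_2\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2]\big) &= \int_{\mathcal{X}} d\mu(x)\, P_{\mathcal{F}_1}(x)\, \mathrm{Tr}\big(P_{\mathcal{F}_2}(x)\big) = \rho[\mathcal{F}_1] .
\end{aligned}
$$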


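A more interesting property is the following one: for all OPUs $\mathcal{F}_{1,2} \subset \mathcal{M}$,

$$
S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2]\big) \ge S\big(\rho[\mathcal{F}_1]\big) . \tag{8.75}
$$

To prove this, consider the case in which the integration measure in (8.71) is a
discrete probability distribution $\mu = \{p_j\}_{j=1}^{d}$, that is
$\rho[\mathcal{F}_1 \circ \mathcal{F}_2] = \sum_{j=1}^{d} p_j\, P_{\mathcal{F}_1}(j) \otimes P_{\mathcal{F}_2}(j)$. Then, construct the density matrix

$$
\rho_{123} = \sum_{j=1}^{d} p_j\, |j\rangle\langle j| \otimes P_{\mathcal{F}_1}(j) \otimes P_{\mathcal{F}_2}(j) ,
$$

where $\{|j\rangle\}_{j=1}^{d}$ is an ONB in $\mathbb{C}^d$ and the labels 1, 2, 3 denote the factors from left to
right. Notice that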


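$$
\begin{aligned}
\rho_{2} &:= \mathrm{Tr}_{13}(\rho_{123}) = \sum_{j=1}^{d} p_j\, P_{\mathcal{F}_1}(j) = \rho[\mathcal{F}_1] , \\
\rho_{12} &:= \mathrm{Tr}_{3}(\rho_{123}) = \sum_{j=1}^{d} p_j\, |j\rangle\langle j| \otimes P_{\mathcal{F}_1}(j) , \\
\rho_{23} &:= \mathrm{Tr}_{1}(\rho_{123}) = \sum_{j=1}^{d} p_j\, P_{\mathcal{F}_1}(j) \otimes P_{\mathcal{F}_2}(j) = \rho[\mathcal{F}_1 \circ \mathcal{F}_2] .
\end{aligned}
$$

Then, strong subadditivity (5.172), $S(\rho_{123}) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{23})$, yields (8.75):
indeed, because of Remark 5.5.5 and of the fact that the $P_{\mathcal{F}_{1,2}}(j)$ project onto
normalized vectors, it turns out that

$$
S(\rho_{123}) = S(\rho_{12}) = -\sum_{j} p_j \log p_j .
$$

Finally, probability distributions given by regular Borel measures $\mu$ can be
approximated by discrete ones; thus, the inequality (8.75) can be extended to generic $\mu$ [11]
by means of the continuity of the von Neumann entropy (5.167) (notice that all the
density matrices considered act on a Hilbert space of fixed finite dimension).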


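Lemma 8.2.1 Given OPUs $\mathcal{F}_{1,2,3} \subset \mathcal{M}$, the entropic distance satisfies

$$
\begin{aligned}
\Delta[\mathcal{F}_1|\mathcal{F}_2] &\ge 0 && (8.76) \\
\Delta\big[\Theta_T[\mathcal{F}_1]\,\big|\,\Theta_T[\mathcal{F}_2]\big] &= \Delta[\mathcal{F}_1|\mathcal{F}_2] && (8.77) \\
\Delta[\mathcal{F}_1 \circ \mathcal{F}_2|\mathcal{F}_3] &\le \Delta[\mathcal{F}_1|\mathcal{F}_3] + \Delta[\mathcal{F}_2|\mathcal{F}_3] && (8.78) \\
\Delta[\mathcal{F}_1|\mathcal{F}_2 \circ \mathcal{F}_3] &\le \Delta[\mathcal{F}_1|\mathcal{F}_2] && (8.79) \\
\Delta\big[\mathcal{F}_1^{(n)}\,\big|\,\mathcal{F}_2^{(n)}\big] &\le n\, \Delta[\mathcal{F}_1|\mathcal{F}_2] . && (8.80)
\end{aligned}
$$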


Proof Positivity is a consequence of (8.72) and (8.75) while time-invariance comes 
from (8.61). Subadditivity in the first argument can be derived as follows. By 
using (8.73) one gets 


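$$
\begin{aligned}
\Delta[\mathcal{F}_1 \circ \mathcal{F}_2|\mathcal{F}_3]
&= S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2 \circ \mathcal{F}_3]\big) - S\big(\rho[\mathcal{F}_3]\big) \\
&= S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_3 \circ \mathcal{F}_2]\big) - S\big(\rho[\mathcal{F}_3]\big) .
\end{aligned}
$$

Setting $\rho_{123} = \rho[\mathcal{F}_1 \circ \mathcal{F}_3 \circ \mathcal{F}_2]$, it turns out that $\rho_2 = \mathrm{Tr}_{13}(\rho_{123}) = \rho[\mathcal{F}_3]$, while
$\rho_{12} = \mathrm{Tr}_3(\rho_{123}) = \rho[\mathcal{F}_1 \circ \mathcal{F}_3]$ and $\rho_{23} = \mathrm{Tr}_1(\rho_{123}) = \rho[\mathcal{F}_3 \circ \mathcal{F}_2]$. Then, strong
subadditivity (5.172), together with (8.73), yields

$$
S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_3 \circ \mathcal{F}_2]\big) + S\big(\rho[\mathcal{F}_3]\big) \le S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_3]\big) + S\big(\rho[\mathcal{F}_2 \circ \mathcal{F}_3]\big) ,
$$

whence

$$
\begin{aligned}
\Delta[\mathcal{F}_1 \circ \mathcal{F}_2|\mathcal{F}_3]
&\le S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_3]\big) + S\big(\rho[\mathcal{F}_2 \circ \mathcal{F}_3]\big) - 2\, S\big(\rho[\mathcal{F}_3]\big) \\
&= \Delta[\mathcal{F}_1|\mathcal{F}_3] + \Delta[\mathcal{F}_2|\mathcal{F}_3] .
\end{aligned}
$$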


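Further, using (8.78) and (8.73),

$$
\begin{aligned}
\Delta[\mathcal{F}_1|\mathcal{F}_2 \circ \mathcal{F}_3]
&= S\big(\rho[\mathcal{F}_1 \circ \mathcal{F}_2 \circ \mathcal{F}_3]\big) - S\big(\rho[\mathcal{F}_2 \circ \mathcal{F}_3]\big) \\
&= \Delta[\mathcal{F}_1 \circ \mathcal{F}_3|\mathcal{F}_2] - \Delta[\mathcal{F}_3|\mathcal{F}_2] \\
&\le \Delta[\mathcal{F}_1|\mathcal{F}_2] + \Delta[\mathcal{F}_3|\mathcal{F}_2] - \Delta[\mathcal{F}_3|\mathcal{F}_2] = \Delta[\mathcal{F}_1|\mathcal{F}_2] .
\end{aligned}
$$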


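Finally, if in (8.78) one puts $\mathcal{F}_1^{(n)}$ (see (8.58)) in the place of $\mathcal{F}_1 \circ \mathcal{F}_2$ and $\mathcal{F}_2^{(n)}$ in
the place of $\mathcal{F}_3$, then using (8.79), (8.73) and (8.77) one gets

$$
\Delta\big[\mathcal{F}_1^{(n)}\,\big|\,\mathcal{F}_2^{(n)}\big]
\le \sum_{k=0}^{n-1} \Delta\big[\Theta_T^{k}[\mathcal{F}_1]\,\big|\,\mathcal{F}_2^{(n)}\big]
\le \sum_{k=0}^{n-1} \Delta\big[\Theta_T^{k}[\mathcal{F}_1]\,\big|\,\Theta_T^{k}[\mathcal{F}_2]\big]
= n\, \Delta[\mathcal{F}_1|\mathcal{F}_2] .
$$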
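
The properties in Lemma 8.2.1 can be checked numerically in the simplest commutative setting. The sketch below is only an illustration and rests on an assumption that is not spelled out in the text: it restricts to OPUs $\mathcal{Z}_{\mathcal{P}}$ built from characteristic functions of finite partitions of a finite set, for which $\rho[\mathcal{Z}_{\mathcal{P}}]$ is diagonal with entries $\mu(P_i)$, refinements of OPUs correspond to common refinements of partitions, and the entropic distance reduces to a conditional Shannon entropy. The weights `mu` and the labelings `P1`, `P2`, `P3` are hypothetical toy data.

```python
import math
from collections import Counter

def shannon(weights):
    """Shannon entropy (natural logarithm) of a collection of probabilities."""
    return -sum(p * math.log(p) for p in weights if p > 0)

def joint_entropy(mu, *labelings):
    """Entropy of the common refinement of partitions given as point labelings.

    mu[x] is the weight of point x; each labeling maps x to the index of the
    atom containing it. For partition OPUs the refined density matrix is
    diagonal, with the weights of the refined atoms on the diagonal.
    """
    atoms = Counter()
    for x, weight in enumerate(mu):
        atoms[tuple(lab[x] for lab in labelings)] += weight
    return shannon(atoms.values())

def delta(mu, first, second):
    """Entropic distance Delta[first|second] as in (8.72); arguments are lists of labelings."""
    return joint_entropy(mu, *first, *second) - joint_entropy(mu, *second)

# Hypothetical toy data: six points with weights mu and three partitions.
mu = [0.10, 0.15, 0.20, 0.05, 0.30, 0.20]
P1 = [0, 0, 1, 1, 2, 2]
P2 = [0, 1, 0, 1, 0, 1]
P3 = [0, 0, 0, 1, 1, 1]

d1_2 = delta(mu, [P1], [P2])         # positivity, cf. (8.76)
d12_3 = delta(mu, [P1, P2], [P3])    # left-hand side of (8.78)
d1_3 = delta(mu, [P1], [P3])
d2_3 = delta(mu, [P2], [P3])
d1_23 = delta(mu, [P1], [P2, P3])    # left-hand side of (8.79)
```

In this restricted setting (8.78) and (8.79) are the familiar statements that conditional Shannon entropy is subadditive in its first argument and does not increase when the conditioning partition is refined.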


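Definition 8.2.4 ([11]) A $*$-subalgebra with identity $\mathcal{M}_0 \subset \mathcal{M}$ is entropy-dense
in $\mathcal{M} = L^\infty_\mu(\mathcal{X})$ if for any finite, measurable partition $\mathcal{P}$ of $\mathcal{X}$ and any $\varepsilon > 0$ there
exists an OPU $\mathcal{F} \subset \mathcal{M}_0$ such that $\Delta[\mathcal{Z}_{\mathcal{P}}|\mathcal{F}] \le \varepsilon$, where $\mathcal{Z}_{\mathcal{P}} \subset \mathcal{M}$ denotes the OPU
corresponding to $\mathcal{P}$.


Theorem 8.2.1 Let $(L^\infty_\mu(\mathcal{X}), \Theta_T, \omega_\mu)$ be the algebraic triple corresponding to a
classical dynamical system $(\mathcal{X}, T, \mu)$. Let $\mathcal{M}_0 \subset \mathcal{M} = L^\infty_\mu(\mathcal{X})$ be a $\Theta_T$-invariant
entropy-dense $*$-subalgebra of $\mathcal{M}$ with identity; then

$$
h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}_0) = h^{\mathrm{KS}}_{\mu}(T) . \tag{8.81}
$$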
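Proof Since $\mathcal{M}_0 \subset \mathcal{M}$, Proposition 8.2.2 gives $h^{\mathrm{KS}}_{\mu}(T) \ge h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}_0)$. Let
$\mathcal{P}$ be any finite, measurable partition of $\mathcal{X}$, $\mathcal{P}^{(n)}$ its dynamical refinement up
to time $t = n - 1$ and $\mathcal{Z}_{\mathcal{P}}$, $\mathcal{Z}_{\mathcal{P}}^{(n)}$ the corresponding OPUs in $\mathcal{M}$. Fix $\varepsilon > 0$ and
choose $\mathcal{F}$ to be an OPU in the entropy-dense $\mathcal{M}_0$ such that $\Delta[\mathcal{Z}_{\mathcal{P}}|\mathcal{F}] \le \varepsilon$; then,
using (8.75), (8.73), (8.72) and (8.80) one gets

$$
\begin{aligned}
S\big(\rho[\mathcal{Z}_{\mathcal{P}}^{(n)}]\big) &\le S\big(\rho[\mathcal{Z}_{\mathcal{P}}^{(n)} \circ \mathcal{F}^{(n)}]\big) = S\big(\rho[\mathcal{F}^{(n)} \circ \mathcal{Z}_{\mathcal{P}}^{(n)}]\big) \\
&= S\big(\rho[\mathcal{F}^{(n)}]\big) + \Delta\big[\mathcal{Z}_{\mathcal{P}}^{(n)}\,\big|\,\mathcal{F}^{(n)}\big] \le S\big(\rho[\mathcal{F}^{(n)}]\big) + n\,\varepsilon .
\end{aligned}
$$

By dividing by $n$ and taking the lim sup, (8.68) gives

$$
h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{Z}_{\mathcal{P}}) = h^{\mathrm{KS}}_{\mu}(T, \mathcal{P}) \le h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{F}) + \varepsilon \le h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}_0) + \varepsilon
$$

for all $\varepsilon > 0$. Therefore, $h^{\mathrm{KS}}_{\mu}(T, \mathcal{P}) \le h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}_0)$ for all finite, measurable
partitions of $\mathcal{X}$, whence $h^{\mathrm{KS}}_{\mu}(T) \le h^{\mathrm{AFL}}_{\omega_\mu}(\Theta_T, \mathcal{M}_0)$.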


Example 8.2.1 Consider the hyperbolic automorphisms of the torus $\mathbb{T}^2$ studied in
Example 2.1.3. To the measure-theoretic triple $(\mathbb{T}^2, T_A, dr)$ one associates the
algebraic dynamical triple $(\mathcal{M}, \Theta_A, \omega)$ where $\mathcal{M} := L^\infty_{dr}(\mathbb{T}^2)$, $\Theta_A := \Theta_{T_A}$ and $\omega$ is the
state obtained by integration with respect to $dr$. We now show that the $*$-subalgebra
$\mathcal{M}_0 \subset \mathcal{M}$ linearly spanned by the exponential functions $e_n(r) = \exp(2\pi i\, n \cdot r)$ is
entropy-dense in $\mathcal{M}$.

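Given a fixed $N \in \mathbb{N}$, the following collection of exponential functions

$$
\mathcal{F}_N = \Big\{ \frac{e_n}{\sqrt{M}} \Big\}_{n \in I_N} , \qquad I_N = \{ n = (n_1, n_2) : -N \le n_i \le N \} ,
$$

where $M := (2N + 1)^2$ is the number of elements of $I_N$, is an OPU; indeed,

$$
\sum_{n \in I_N} \frac{e_n^*}{\sqrt{M}}\, \frac{e_n}{\sqrt{M}} = \frac{1}{M} \sum_{n \in I_N} |e_n|^2 = \mathbb{1} ,
$$

where $\mathbb{1}$ is the identity function on $\mathbb{T}^2$. Notice that the functions $e_n$ are orthogonal,
namely $\omega(e_n^* e_m) = \delta_{n,m}$; thus, (8.60) reads

$$
\rho[\mathcal{F}_N] = \frac{1}{M} \sum_{n \in I_N} |z_n\rangle\langle z_n| ,
$$

where $\{|z_n\rangle\}_{n \in I_N}$ is an ONB in $\mathbb{C}^M$. Then, $S(\rho[\mathcal{F}_N]) = \log M$, which is also the
von Neumann entropy of the state in (8.65),

$$
\mathbb{F}_{\mathcal{F}_N}\big[|\Omega_\omega\rangle\langle\Omega_\omega|\big] = \frac{1}{M} \sum_{n \in I_N} |e_n\rangle\langle e_n| = \frac{1}{M}\, P_N ,
$$

where $P_N$ is the orthogonal projection onto the $M$-dimensional subspace spanned by
the $M$ orthogonal vectors $|e_n\rangle$. Also, we have chosen the GNS representation where
$|\Omega_\omega\rangle$ is the identity function on $\mathbb{T}^2$ and the action of $\pi_\omega(f)$ on vectors of $L^2_{dr}(\mathbb{T}^2)$
is the multiplication by $f \in \mathcal{M}$: $\langle r|\pi_\omega(f)|\psi\rangle = f(r)\,\psi(r)$ (see Example 5.3.2.2).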

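Consider now a finite, measurable partition $\mathcal{P} = \{P_j\}_{j=1}^{m}$ of $\mathbb{T}^2$ and the
corresponding OPU $\mathcal{Z}_{\mathcal{P}}$; we want to estimate

$$
\Delta[\mathcal{Z}_{\mathcal{P}}|\mathcal{F}_N] = S\big(\rho[\mathcal{Z}_{\mathcal{P}} \circ \mathcal{F}_N]\big) - S\big(\rho[\mathcal{F}_N]\big) = S\big(\rho[\mathcal{Z}_{\mathcal{P}} \circ \mathcal{F}_N]\big) - \log M .
$$

The von Neumann entropy of $\rho[\mathcal{Z}_{\mathcal{P}} \circ \mathcal{F}_N]$ is the same as the von Neumann entropy
of (see (8.67))

$$
\begin{aligned}
\sigma_N := \mathbb{F}_{\mathcal{Z}_{\mathcal{P}} \circ \mathcal{F}_N}\big[|\Omega_\omega\rangle\langle\Omega_\omega|\big]
&= \frac{1}{M} \sum_{j=1}^{m} \sum_{n \in I_N} \pi_\omega(\chi_{P_j} e_n)\, |\Omega_\omega\rangle\langle\Omega_\omega|\, \pi_\omega(\chi_{P_j} e_n)^\dagger \\
&= \frac{1}{M} \sum_{j=1}^{m} Q_j P_N Q_j , \qquad \text{where } \langle r|Q_j|\psi\rangle = \chi_{P_j}(r)\, \psi(r) , \\
&= \frac{1}{M} \sum_{j=1}^{m} \mathrm{Tr}(Q_j P_N Q_j)\, \sigma_j , \qquad \text{where } \sigma_j := \frac{Q_j P_N Q_j}{\mathrm{Tr}(Q_j P_N Q_j)} .
\end{aligned}
$$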


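To compute $\mathrm{Tr}(Q_j P_N Q_j)$ we use Example 5.2.3.8; since $Q_j P_N Q_j = X^\dagger X$, $X := P_N Q_j$,
its spectrum is the same as that of $X X^\dagger = P_N Q_j P_N$ and thus their traces
are the same. Since

$$
\mathrm{Tr}(P_N Q_j P_N) = \sum_{n \in I_N} \langle e_n|Q_j|e_n\rangle = \sum_{n \in I_N} \int_{\mathbb{T}^2} dr\, |e_n(r)|^2\, \chi_{P_j}(r) = M\, \mu(P_j) ,
$$

it follows that $\sigma_N = \sum_{j=1}^{m} \mu(P_j)\, \sigma_j$. Furthermore, as the $\sigma_j$ are density matrices
with orthogonal ranges, Remark 5.5.5 yields

$$
\begin{aligned}
S(\sigma_N) &= \sum_{j=1}^{m} \eta\big(\mu(P_j)\big) + \sum_{j=1}^{m} \mu(P_j)\, S(\sigma_j) \\
&= \sum_{j=1}^{m} \eta\big(\mu(P_j)\big) + \sum_{j=1}^{m} \mu(P_j) \Big( \frac{\mathrm{Tr}\big(\eta(Q_j P_N Q_j)\big)}{M\, \mu(P_j)} + \log\big(M \mu(P_j)\big) \Big) \\
&= \frac{1}{M} \sum_{j=1}^{m} \mathrm{Tr}\big(\eta(Q_j P_N Q_j)\big) + \log M , \quad \text{whence} \\
\Delta[\mathcal{Z}_{\mathcal{P}}|\mathcal{F}_N] &= \frac{1}{M} \sum_{j=1}^{m} \mathrm{Tr}\big(\eta(Q_j P_N Q_j)\big) , \qquad \eta(x) = -x \log x .
\end{aligned}
$$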
j=! 


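We now conclude the proof by showing that, for $N$ large enough, the right hand
side of the last equality can be made negligibly small. As already seen, $Q_j P_N Q_j$
and $P_N Q_j P_N$ have the same spectrum; therefore,

$$
\mathrm{Tr}\big(\eta(Q_j P_N Q_j)\big) = \mathrm{Tr}\big(\eta(P_N Q_j P_N)\big) .
$$

Further, $\eta(x) \ge x(1 - x)$ for all $0 \le x \le 1$ and $\eta(x) - x(1 - x)$ is bounded; thus, for
all $\varepsilon > 0$, there exists $C(\varepsilon) > 0$ such that [11] $\eta(x) \le \varepsilon + C(\varepsilon)\, x(1 - x)$. Applying
this inequality to the eigenvalues $\pi_{jk}$ of $P_N Q_j P_N$ and observing that $0 \le \pi_{jk} \le 1$
for $P_N Q_j P_N \le \mathbb{1}$, one gets

$$
\begin{aligned}
\frac{1}{M}\, \mathrm{Tr}\big(\eta(P_N Q_j P_N)\big) = \frac{1}{M} \sum_{k} \eta(\pi_{jk})
&\le \frac{1}{M} \sum_{k} \big( \varepsilon + C(\varepsilon)\, \pi_{jk} (1 - \pi_{jk}) \big) \\
&\le \varepsilon + \frac{C(\varepsilon)}{M} \Big( \mathrm{Tr}(P_N Q_j P_N) - \mathrm{Tr}\big((P_N Q_j P_N)^2\big) \Big) \\
&= \varepsilon + C(\varepsilon) \Big( \mu(P_j) - \frac{1}{M} \sum_{n \in I_N} \langle e_n|Q_j P_N Q_j|e_n\rangle \Big) .
\end{aligned}
$$

In the second inequality it has been used that the range of $P_N$ has dimension $M$.
Since the exponential functions $|e_n\rangle$ form an ONB in $L^2_{dr}(\mathbb{T}^2)$, by increasing $N$
(and thus $M$) one makes $P_N \to \mathbb{1}$ so that

$$
\frac{1}{M} \sum_{n \in I_N} \langle e_n|Q_j P_N Q_j|e_n\rangle = \mu(P_j) - \frac{1}{M} \sum_{n \in I_N} \langle e_n|Q_j (\mathbb{1} - P_N) Q_j|e_n\rangle
$$

tends to $\mu(P_j)$ when $N \to \infty$.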
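
The elementary bound $\eta(x) \le \varepsilon + C(\varepsilon)\, x(1 - x)$ used above can be explored numerically. The following sketch is a grid-based illustration under stated assumptions, not a proof: the function names `eta` and `constant_for`, the grid size and the value of `eps` are all hypothetical choices made here, and the constant found is only the smallest one that works on the chosen grid.

```python
import math

def eta(x):
    """eta(x) = -x log x, with eta(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0

def constant_for(eps, grid=10000):
    """Smallest C on the grid such that eta(x) <= eps + C x (1 - x)."""
    best = 0.0
    for k in range(1, grid):
        x = k / grid
        excess = eta(x) - eps
        if excess > 0:
            best = max(best, excess / (x * (1 - x)))
    return best

eps = 0.01  # hypothetical tolerance
C = constant_for(eps)

# Largest violation of the bound over the same grid (should not be positive).
worst_gap = max(
    eta(k / 10000) - eps - C * (k / 10000) * (1 - k / 10000)
    for k in range(0, 10001)
)
```

The constant stays finite because $\eta(x) > \varepsilon$ only on an interval bounded away from both $0$ and $1$, where $x(1 - x)$ does not vanish.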


8.2.3.1 AFL Entropy: Finite Quantum Systems 

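Like the CNT entropy, the AFL entropy also vanishes for finite-level quantum
systems. In order to show this we start by deriving a useful bound [11] on the von
Neumann entropy $S(\rho[\mathcal{Z}])$ of a given OPU $\mathcal{Z} = \{Z_j\}_{j=1}^{|\mathcal{Z}|} \subset \mathbb{B}(\mathbb{H})$, when $\mathbb{B}(\mathbb{H})$ is
equipped with a state represented by a density matrix $\rho$.

Let $0 \le r_j \le 1$ and $|r_j\rangle$ be the eigenvalues and eigenvectors of $\rho$ and $\{|z_j\rangle\}_{j=1}^{|\mathcal{Z}|}$
an ONB in $\mathbb{C}^{|\mathcal{Z}|}$; because of Definition 8.2.1, the vectors

$$
|\Psi^{\mathcal{Z}}_j\rangle = \sum_{k=1}^{|\mathcal{Z}|} Z_k |r_j\rangle \otimes |z_k\rangle
$$

are orthogonal, indeed (8.55) yields

$$
\langle \Psi^{\mathcal{Z}}_j|\Psi^{\mathcal{Z}}_\ell\rangle = \sum_{k=1}^{|\mathcal{Z}|} \langle r_j|Z_k^\dagger Z_k|r_\ell\rangle = \langle r_j|r_\ell\rangle = \delta_{j\ell} .
$$

Set $\rho_{\mathcal{Z}} := \sum_j r_j\, |\Psi^{\mathcal{Z}}_j\rangle\langle\Psi^{\mathcal{Z}}_j|$; then, $S(\rho_{\mathcal{Z}}) = S(\rho)$ and

$$
\begin{aligned}
\mathbb{B}_1(\mathbb{H}) \ni \rho_{I} &:= \mathrm{Tr}_{II}(\rho_{\mathcal{Z}}) = \sum_{j} \sum_{k=1}^{|\mathcal{Z}|} r_j\, Z_k |r_j\rangle\langle r_j| Z_k^\dagger = \mathbb{F}_{\mathcal{Z}}[\rho] , \\
\mathbb{B}_1(\mathbb{C}^{|\mathcal{Z}|}) \ni \rho_{II} &:= \mathrm{Tr}_{I}(\rho_{\mathcal{Z}}) = \sum_{j,k=1}^{|\mathcal{Z}|} \mathrm{Tr}\big(\rho\, Z_k^\dagger Z_j\big)\, |z_j\rangle\langle z_k| = \rho[\mathcal{Z}] .
\end{aligned}
$$

Applying subadditivity (5.171) to these marginal density matrices one obtains the
following upper bound to $S(\rho[\mathcal{Z}])$.


Proposition 8.2.3 Let $\rho \in \mathbb{B}_1(\mathbb{H})$ be a state on $\mathcal{A} = \mathbb{B}(\mathbb{H})$ and $\mathcal{Z} = \{Z_j\}_{j=1}^{|\mathcal{Z}|} \subset \mathbb{B}(\mathbb{H})$
any fixed OPU; then

$$
S\big(\rho[\mathcal{Z}]\big) \le S(\rho) + S\big(\mathbb{F}_{\mathcal{Z}}[\rho]\big) . \tag{8.82}
$$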
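
The bound (8.82) can be tested on the smallest non-trivial case. The sketch below assumes a real $2 \times 2$ density matrix and a two-element OPU made of commuting real diagonal matrices $Z_1 = \mathrm{diag}(\cos\alpha, \cos\beta)$, $Z_2 = \mathrm{diag}(\sin\alpha, \sin\beta)$, so that all entropies follow from closed-form eigenvalues of symmetric $2 \times 2$ matrices. The numerical values of `rho`, `alpha` and `beta` are hypothetical; the check illustrates the inequality for one instance and proves nothing in general.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def entropy_2x2(A):
    """von Neumann entropy of a real symmetric 2x2 density matrix."""
    tr = trace(A)
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    gap = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = [(tr + gap) / 2, (tr - gap) / 2]
    return -sum(r * math.log(r) for r in eigenvalues if r > 1e-15)

# Hypothetical state and OPU: Z1^T Z1 + Z2^T Z2 is the identity.
rho = [[0.7, 0.2], [0.2, 0.3]]
alpha, beta = 0.4, 1.1
Z = [
    [[math.cos(alpha), 0.0], [0.0, math.cos(beta)]],
    [[math.sin(alpha), 0.0], [0.0, math.sin(beta)]],
]

# rho[Z]_{ij} = Tr(rho Z_j^dagger Z_i)
rho_Z = [[trace(matmul(rho, matmul(transpose(Z[j]), Z[i]))) for j in range(2)]
         for i in range(2)]

# F_Z[rho] = sum_k Z_k rho Z_k^dagger
F_rho = [[0.0, 0.0], [0.0, 0.0]]
for Zk in Z:
    F_rho = add(F_rho, matmul(matmul(Zk, rho), transpose(Zk)))

lhs = entropy_2x2(rho_Z)
rhs = entropy_2x2(rho) + entropy_2x2(F_rho)
```

Here `lhs` plays the role of $S(\rho[\mathcal{Z}])$ and `rhs` that of $S(\rho) + S(\mathbb{F}_{\mathcal{Z}}[\rho])$; both sides are bounded by multiples of $\log 2$ whatever the OPU, which is the mechanism behind the vanishing of the entropy rate for finite-level systems.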


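If $\mathbb{B}(\mathbb{H}) = M_d(\mathbb{C})$, then the von Neumann entropies of both $\rho$ and $\mathbb{F}_{\mathcal{Z}}[\rho]$ are upper
bounded by $\log d$ independently of $\mathcal{Z}$. Since in the case of a finite-level system the
dynamics is implemented by a unitary operator which belongs to $M_d(\mathbb{C})$, all OPUs
from $M_d(\mathbb{C})$ are such that the refined OPUs $\mathcal{Z}^{(n)}$ up to discrete time $t = n - 1$
also belong to $M_d(\mathbb{C})$. Hence $S(\rho[\mathcal{Z}^{(n)}]) \le 2 \log d$ for all $n$, and the following result holds.


Proposition 8.2.4 Let $(\mathcal{A}, \Theta_U, \omega)$ be a finite-level quantum system, where $\mathcal{A} = M_d(\mathbb{C})$,
$\omega$ corresponds to a density matrix $\rho \in \mathbb{B}_1(\mathbb{C}^d)$ and $\Theta_U$ is implemented by
a unitary $U \in M_d(\mathbb{C})$. Then, for all OPUs $\mathcal{Z} \subset M_d(\mathbb{C})$,

$$
h^{\mathrm{AFL}}_{\omega}(\Theta_U, \mathcal{Z}) = 0 , \qquad h^{\mathrm{AFL}}_{\omega}(\Theta_U) = 0 .
$$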


Remark 8.2.2 The fact that for finite-level systems the AFL entropy vanishes is
neither a surprise nor the end of the story. Indeed, in quantum chaotic phenomena
[94,383], namely when studying the behavior for $\hbar \to 0$ of quantum systems
with a chaotic classical limit, the associated classical instability manifests itself in
the presence of a logarithmic time-scale as in Remark 2.1.3.4. Roughly speaking, the
explanation is that, in the semi-classical approximation, quantizing means operating
a coarse graining of the phase-space into atoms of size $2\pi\hbar$: this forbids the
existence of a bona fide Lyapounov exponent, but makes its classical existence felt
up to times that scale as $-\log(\hbar/S)$ (where $\hbar$ is normalized to a reference classical
action $S$). In some models, as for instance the quantized finite Arnold cat map in
Example 5.6.1.3 and the kicked top in [218], the classical limit can be mimicked by
letting the dimension $N$ of the underlying Hilbert space grow, $N \to +\infty$. The
AFL construction, notably the entropy of a time-evolving OPU, has been applied
to such cases and proved to increase linearly with the number of timesteps $T$ up to
$T \simeq \log N$ [11,13,27]. Interestingly, the AFL entropy has also been applied to study
the emergence of chaos in the continuous limit of discretized classical dynamical
systems [26,28], where the suppression of instability also finds its root in a finite
coarse graining of the phase space.


8.2.3.2 AFL Entropy: Quantum Spin Chains 

Interestingly, the AFL entropy of quantum sources differs from the CNT entropy
by a correction term which increases with the dimension of single site algebras.
According to Remark 8.2.1, the OPUs will be taken from strictly local subalgebras
of the quasi-local source algebra $\mathcal{A}$.

Proposition 8.2.5 Let $(\mathcal{A}_{\mathbb{Z}}, \omega)$ be a quantum spin chain with single site matrix
algebras $M_d(\mathbb{C})$. Relative to OPUs from any local subalgebra $\mathcal{A}_{[p,q]}$,
$\mathcal{X} \subset \mathcal{A}_{[p,q]} \subseteq \mathcal{A}_0 := \mathcal{A}^{loc}_{\mathbb{Z}}$, the AFL entropy is given by

$$
h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) = s(\omega) + \log d ,
$$

where the dynamics is the shift $\Theta_\sigma$ over $\mathcal{A}_{\mathbb{Z}}$, and the translation-invariant state
$\omega \circ \Theta_\sigma = \omega$ has mean von Neumann entropy $s(\omega)$.


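Proof Because of translation-invariance of $\omega$, it is no restriction to take
$\mathcal{X} = \{X_i\}_{i=1}^{|\mathcal{X}|}$, $X_i \in \mathcal{A}_{[0,\ell]}$. It follows that the dynamical refinements $\mathcal{X}^{(k)}$ are localized
within $[0, \ell + k - 1]$. With $\rho = \omega \restriction \mathcal{A}_{[0,\ell+k-1]}$ and $\mathbb{F}_{\mathcal{X}^{(k)}}[\rho]$ both density matrices in
$M_d(\mathbb{C})^{\otimes(\ell+k)}$, so that $S(\mathbb{F}_{\mathcal{X}^{(k)}}[\rho]) \le (\ell + k) \log d$, (8.82) yields

$$
h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma, \mathcal{X}) \le \limsup_{k \to \infty} \frac{1}{k}\, S\big(\rho_{[0,\ell+k-1]}\big) + \log d = s(\omega) + \log d ,
$$

whence $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) \le s(\omega) + \log d$. The upper bound is reached by an OPU
consisting of matrix units $e^{0}_{p_0 q_0} := |p_0\rangle\langle q_0|$ from the algebra $M_d(\mathbb{C})$ at site
0, where $\{|p\rangle\}_{p=1}^{d}$ is an ONB in $\mathbb{C}^d$ [161].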


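Explicitly, $\mathcal{X} = \big\{ d^{-1/2}\, e^{0}_{p_0 q_0} \big\}_{p_0, q_0 = 1}^{d}$ and

$$
\mathcal{X}^{(k)} = \Big\{ \frac{1}{d^{k/2}}\, e^{(k)}_{\boldsymbol{p}\boldsymbol{q}} \Big\}_{(\boldsymbol{p},\boldsymbol{q})} , \qquad
e^{(k)}_{\boldsymbol{p}\boldsymbol{q}} := \bigotimes_{i=0}^{k-1} e^{i}_{p_i q_i} .
$$

According to (8.60), the matrix elements of the $(d^k \times d^k) \times (d^k \times d^k)$ density matrix
$\rho[\mathcal{X}^{(k)}]$ are thus given by

$$
\rho[\mathcal{X}^{(k)}]_{(\boldsymbol{p},\boldsymbol{q}),(\boldsymbol{r},\boldsymbol{s})}
= \frac{1}{d^k}\, \omega\Big( \big(e^{(k)}_{\boldsymbol{r}\boldsymbol{s}}\big)^\dagger e^{(k)}_{\boldsymbol{p}\boldsymbol{q}} \Big)
= \frac{1}{d^k}\, \omega\Big( \bigotimes_{i=0}^{k-1} e^{i}_{s_i r_i}\, e^{i}_{p_i q_i} \Big)
= \frac{1}{d^k} \prod_{i=0}^{k-1} \delta_{r_i p_i}\; \omega\Big( \bigotimes_{i=0}^{k-1} e^{i}_{s_i q_i} \Big) .
$$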

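The expectations on the right hand side of the last equality define the local state
$\rho_{[0,k-1]} := \omega \restriction \mathcal{A}_{[0,k-1]}$ so that

$$
\rho[\mathcal{X}^{(k)}] = \frac{1}{d^k}\, \mathbb{1}_{d^k} \otimes \rho_{[0,k-1]} \;\Longrightarrow\; S\big(\rho[\mathcal{X}^{(k)}]\big) = S\big(\rho_{[0,k-1]}\big) + k \log d
$$

and $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) \ge h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma, \mathcal{X}) = s(\omega) + \log d$.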
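
The entropy count behind the lower bound can be made explicit in the simplest case. The sketch below assumes, purely for illustration, a product (Bernoulli) state, so that $\rho_{[0,k-1]} = \rho^{\otimes k}$ and the spectrum of $d^{-k}\, \mathbb{1}_{d^k} \otimes \rho^{\otimes k}$ can be listed directly; for a general translation-invariant state the equality with $s(\omega) + \log d$ holds only in the limit $k \to \infty$. The single-site spectrum `site_spectrum` is a hypothetical choice.

```python
import math
from itertools import product

def entropy(spectrum):
    """Shannon / von Neumann entropy of a list of eigenvalues (natural log)."""
    return -sum(p * math.log(p) for p in spectrum if p > 0)

def afl_block_entropy(site_spectrum, k):
    """Entropy of (1/d^k) 1 (x) rho^{(x) k}, computed from its explicit spectrum."""
    d = len(site_spectrum)
    # Eigenvalues of rho^{(x) k}: all products of k single-site eigenvalues.
    local = [math.prod(choice) for choice in product(site_spectrum, repeat=k)]
    # Tensoring with the maximally mixed state on C^{d^k} divides each
    # eigenvalue by d^k and repeats it d^k times.
    full = [p / d**k for p in local for _ in range(d**k)]
    return entropy(full)

site_spectrum = [0.5, 0.3, 0.2]  # hypothetical single-site eigenvalues, d = 3
k = 3
rate = afl_block_entropy(site_spectrum, k) / k
expected = entropy(site_spectrum) + math.log(len(site_spectrum))
```

For a product state `rate` equals `expected` for every `k`, which is the statement $S(\rho[\mathcal{X}^{(k)}]) = k\,(S(\rho) + \log d)$.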
8.2.3.3 AFL Entropy: Price-Powers Shifts

Price-Powers shifts (see Definition 7.4.14) provide non-commutative contexts 
whereby the differences between CNT and AFL entropies can be better appreci- 
ated [14]: indeed, it turns out that, while the former depends on the bit-stream g, the 
latter does not. 


Proposition 8.2.6 Let the triplet $(\mathfrak{U}_g, \Theta_\sigma, \omega)$ represent a Price-Powers shift with
bitstream $g$; then, relative to local OPUs, $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) = 1$ independently of $g$.


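Proof As for quantum spin chains, OPUs will be taken from local subalgebras,
$\mathcal{X} \subset \mathcal{A}_0 = \mathfrak{U}^{loc}$; by translation-invariance of the state $\omega$, we can always suppose
$\mathcal{X} = \{X_i\}_{i=1}^{|\mathcal{X}|} \subset \mathfrak{U}_{[0,\ell]}$, so that $\mathcal{X}^{(k)} \subset \mathfrak{U}_{[0,\ell+k-1]}$. The latter local subalgebra is not
isomorphic to a full-matrix algebra, rather to an orthogonal sum of $m$ full-matrix
algebras of sizes $\nu_j \times \nu_j$: $\mathfrak{U}_{[0,\ell+k-1]} = \bigoplus_{j=1}^{m} M_{\nu_j}(\mathbb{C})$.

As a linear space $\mathfrak{U}_{[0,\ell+k-1]}$ is $2^{\ell+k}$ dimensional (this is the number of independent
$W_i$ that generate it), while each of the contributing $M_{\nu_j}(\mathbb{C})$ is a $\nu_j^2$-dimensional
linear space, whence the constraint $2^{\ell+k} = \sum_{j=1}^{m} \nu_j^2$. From the splitting of $\mathfrak{U}_{[0,\ell+k-1]}$,
the elements $X_{i^{(k)}}$, $i^{(k)} = i_0 i_1 \cdots i_{k-1} \in \Omega^{(k)}_{|\mathcal{X}|}$, of $\mathcal{X}^{(k)}$ can be decomposed as
$X_{i^{(k)}} = \bigoplus_{j=1}^{m} X^{j}_{i^{(k)}}$ and

$$
\rho_{[0,\ell+k-1]} = \bigoplus_{j=1}^{m} \delta_j\, \tau_j ,
$$

where $\tau_j = \mathbb{1}_{\nu_j}/\nu_j$ are tracial states on $M_{\nu_j}(\mathbb{C})$, while $0 \le \delta_j$, $\sum_{j=1}^{m} \delta_j = 1$ account
for the various multiplicities. It follows that

$$
\rho[\mathcal{X}^{(k)}] = \sum_{j=1}^{m} \delta_j\, \rho^{(k)}_j , \qquad
\rho^{(k)}_j := \Big[ \tau_j\Big( \big(X^{j}_{l^{(k)}}\big)^\dagger X^{j}_{i^{(k)}} \Big) \Big]_{i^{(k)},\, l^{(k)}} .
$$

Then, (8.82), (5.166), (5.165) and concavity of $\log x$ yield

$$
S\big(\rho[\mathcal{X}^{(k)}]\big) \le \sum_{j=1}^{m} \delta_j\, S\big(\rho^{(k)}_j\big) - \sum_{j=1}^{m} \delta_j \log \delta_j
\le \sum_{j=1}^{m} \delta_j \log \frac{\nu_j^2}{\delta_j}
\le \log \Big( \sum_{j=1}^{m} \nu_j^2 \Big) = \ell + k ,
$$

whence $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) \le 1$. The bound is attained at the OPU consisting of orthogonal
projectors at site $j = 0$,

$$
\mathcal{X} = \{ p^{0}_{0}, p^{0}_{1} \} , \qquad p^{0}_{i} := \frac{\mathbb{1} + (-1)^{i} e_0}{2} .
$$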
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(k) — -— T10 J 
In fact, Y7 = [piw := ID=- Pi; leat and 


k-1 


ALL}, ja = (Pj) Pj) = os I] Sie je = 2* 550) ja 
l=0 


The last equality follows by using (7.119) and (7.125); w is tracial and the pi orthog- 
onal for fixed i, thus 


k-1 k-1 
Ww (pj© Pv) =W (n P3, I Tr) 
r=0 s= 


EEE T , 0 k—2 „k-11 „k-2, 1 
= N (PY, Pipa Pin) Pika ph) 


oe ee mo, ne = ee 
= ig jo Sint jx—1 Oin—2 jn (ri, Pipa Pig _) Pig_3 ph) al 


k-3 2 1 3 
+ ig jo Sixt j1% (. Pig p [in >, a Pip 3" -) f (*) 


Using (7.119) one calculates 


ae? eee 
[Pii pk? ] = ge a OP) eters. 


pu eee either this commutator vanishes because g(1) = 1 or the operator 
pr ee ata Pi’ p =| (it belongs to the subalgebra U/,,—2,,—1]) cannot be turned into 
an identity by means of (7.117) as all the other p come from different sites. Thus, 


(x) vanishes. Iteration of this argument yields 


—1 k—1 
k k = 
ao a j= m Õie je w (n pa) =2*ð;j. 
l=0 r=0 


The density matrix $\rho[\mathcal{X}^{(k)}]$ is thus diagonal with eigenvalues $2^{-k}$, whence
$S(\rho[\mathcal{X}^{(k)}]) = k \log 2$ and $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) \ge 1$.


Remark 8.2.3 While the AFL entropy is always log 2 for all bitstreams, instead the 
CNT entropy varies from 0 to log 2 (see (8.35)-(8.37)). Since the bitstream fixes 
the degree of departure from commutativity, this fact indicates the CNT entropy is 
sensitive to the dynamics, but also to the algebraic structure of quantum dynamical 
systems, in particular to whether they are asymptotically Abelian. On the contrary, 
the AFL entropy accounts for the effects of the dynamics not directly, rather through a 
particular family, in general not translation-invariant, of local density matrices over a 
quantum spin chain. As such, it is more sensitive to the properties of the state w and in 
some cases strongly depends on the OPUs that are used to construct the local density 
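matrices. The effects of the OPUs are at the root of the fact that the AFL entropy
of a spin chain is the entropy density augmented by the logarithm of the dimension
of the spin algebras, $h^{\mathrm{AFL}}_{\omega}(\Theta_\sigma) = s(\omega) + \log d$, whereas the CNT entropy equals
the entropy density, $h^{\mathrm{CNT}}_{\omega}(\Theta_\sigma) = s(\omega)$. If the OPUs are freely chosen and not carefully
selected from a suitable $\Theta$-invariant $\mathcal{A}_0$ in such a way that the perturbations are kept to a
minimum, it may happen that even dynamical systems without dynamics have
non-zero AFL entropy. An abstract though revealing example is that of a so-called
Cuntz algebra [11], namely the $C^*$ algebra $\mathcal{A}$ generated by the identity $\mathbb{1}$ and by
linear combinations of products $W_i := S_{i_1} S_{i_2} \cdots S_{i_n}$ of two isometries $S_i$, $i = 0, 1$,
such that

$$
S_0^\dagger S_0 = S_1^\dagger S_1 = \mathbb{1} , \qquad S_0 S_0^\dagger + S_1 S_1^\dagger = \mathbb{1} .
$$

It turns out that

$$
S_0^\dagger S_1 = S_0^\dagger \big( S_0 S_0^\dagger + S_1 S_1^\dagger \big) S_1 = S_0^\dagger S_1 + S_0^\dagger S_1 \;\Longrightarrow\; S_0^\dagger S_1 = 0 .
$$

Let the Cuntz algebra $\mathcal{A}$ be equipped with the tracial state $\omega(W_i) = 0$ unless
$W_i = \mathbb{1}$, in which case $\omega(\mathbb{1}) = 1$, and take the dynamics as trivial, $\Theta = \mathrm{id}_{\mathcal{A}}$, namely
$\Theta[W_i] = W_i$ for all $W_i$. While $h^{\mathrm{CNT}}_{\omega}(\mathrm{id}_{\mathcal{A}}) = 0$, the entropy of the refinements of the OPU
$\mathcal{X} = \{S_i/\sqrt{2}\}_{i=0}^{1}$ diverges. Indeed, the elements of the partition $\mathcal{X}^{(k)}$ are of the
form $X_{i^{(k)}} := 2^{-k/2}\, S_{i_k} S_{i_{k-1}} \cdots S_{i_1}$; thus,

$$
\omega\big( X_{i^{(k)}}^\dagger X_{j^{(k)}} \big) = 2^{-k}\, \omega\big( S_{i_1}^\dagger \cdots S_{i_k}^\dagger\, S_{j_k} \cdots S_{j_1} \big) = 2^{-k}\, \delta_{i^{(k)} j^{(k)}} , \quad \text{whence}
$$

$$
S\big(\rho[\mathcal{X}^{(k)}]\big) = k \log 2 \;\Longrightarrow\; h^{\mathrm{AFL}}_{\omega}(\mathrm{id}_{\mathcal{A}}, \mathcal{X}) = \log 2 .
$$

Therefore, using Remark 8.2.1.1, one deduces that, if the OPUs are freely taken from
$\mathcal{A}$, then $h^{\mathrm{AFL}}_{\omega}(\mathrm{id}_{\mathcal{A}}) = +\infty$.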
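
The Cuntz algebra computation can be mimicked symbolically. The sketch below is only an illustration: it encodes the elements of $\mathcal{X}^{(k)}$ as binary words and uses nothing but the relations $S_a^\dagger S_b = \delta_{ab}\, \mathbb{1}$ and $\omega(\mathbb{1}) = 1$ quoted in the remark; the function names are hypothetical.

```python
import math
from itertools import product

def expectation(word_i, word_j):
    """omega(X_i^dagger X_j) for X_i = 2^{-k/2} S_{i_k} ... S_{i_1}.

    Only S_a^dagger S_b = delta_{ab} 1 and omega(1) = 1 are used: the adjoint
    word cancels against word_j pair by pair, from the innermost pair outwards,
    and a single mismatch makes the product vanish.
    """
    k = len(word_i)
    for a, b in zip(word_i, word_j):
        if a != b:
            return 0.0
    return 2.0 ** (-k)

def entropy_of_refinement(k):
    """von Neumann entropy of rho[X^(k)] for the OPU {S_0/sqrt2, S_1/sqrt2}."""
    words = list(product((0, 1), repeat=k))
    rho = [[expectation(wi, wj) for wj in words] for wi in words]
    size = len(words)
    # rho is diagonal here, so its eigenvalues are its diagonal entries.
    assert all(rho[i][j] == 0 for i in range(size) for j in range(size) if i != j)
    return -sum(rho[i][i] * math.log(rho[i][i]) for i in range(size))

rates = [entropy_of_refinement(k) / k for k in range(1, 6)]
```

Every entry of `rates` equals $\log 2$ although the dynamics is trivial, which illustrates that the entropy production comes entirely from the choice of OPU.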


8.2.3.4 AFL Entropy: Arnold Cat Maps 

We now consider the infinite dimensional quantized hyperbolic automorphisms of
the torus in Example 7.4.6, namely the triplets $(\mathcal{M}_\theta, \Theta_A, \omega)$, where $\mathcal{M}_\theta$ is the
von Neumann algebra generated by the Weyl operators (7.30), equipped with the
automorphism (7.34) and the $\Theta_A$-invariant tracial state (7.34).

We shall show that, independently of the deformation parameter $\theta$, when the OPUs
are taken from the $\Theta_A$-invariant $*$-subalgebra $\mathcal{A}_\theta$ generated by the Weyl operators
$W_\theta(f)$ where the $f$ have compact support, the AFL entropy of $(\mathcal{M}_\theta, \Theta_A, \omega)$
coincides with the KS entropy [16] (see Proposition 3.1.1).


Proposition 8.2.7 $h^{\mathrm{AFL}}_{\omega}(\Theta_A) = \log a$ for all $(\mathcal{M}_\theta, \Theta_A, \omega)$, where $a > 1$ is the
largest eigenvalue of $A$ (see Example 2.1.3) and the OPUs are taken from the
$\Theta_A$-invariant subalgebra $\mathcal{A}_\theta \subset \mathcal{M}_\theta$ which is generated by the Weyl operators $W_\theta(f)$
with compact $\mathrm{Supp}(f)$.

Remark 8.2.4 Since the AFL entropy does not depend on $\theta$ and because, for $\theta = 0$,
the quantum dynamical system $(\mathcal{M}_\theta, \Theta_A, \omega)$ reduces to the classical hyperbolic
automorphisms of the torus (see Example 8.2.1), the proof which follows is another
way to compute $h^{\mathrm{KS}}_{\mu}(T_A)$ and thus the Lyapounov exponents.


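The OPUs that will be repeatedly used in the following have the form

$$
\mathcal{Z} = \Big\{ Z_j = \frac{e^{i\beta_j}}{\sqrt{p}}\, W_\theta(n_j) \Big\}_{j=1}^{p} , \qquad n_j \in \mathbb{Z}^2 . \tag{8.83}
$$

Since $\omega(Z_i^\dagger Z_j) = \frac{e^{i(\beta_j - \beta_i)}}{p}\, \delta_{n_i, n_j}$, from (8.60) one computes

$$
\rho[\mathcal{Z}] = \sum_{i,j=1}^{p} |z_i\rangle\langle z_j|\; \omega(Z_j^\dagger Z_i) = \sum_{i,j\,:\, n_i = n_j} \frac{e^{i(\beta_i - \beta_j)}}{p}\, |z_i\rangle\langle z_j| .
$$

The index set $\{1, 2, \ldots, p\}$ can thus be divided into disjoint equivalence classes

$$
\{1, 2, \ldots, p\} = \bigcup_{[i]} [i] , \qquad [i] := \{ 1 \le a \le p \,:\, n_a = n_i \} .
$$

If $\#[i]$ denotes the cardinality of $[i]$, one can then write

$$
\rho[\mathcal{Z}] = \sum_{[i]} \frac{1}{p} \sum_{a,b \in [i]} e^{i(\beta_a - \beta_b)}\, |z_a\rangle\langle z_b| = \sum_{[i]} \frac{\#[i]}{p}\, |[i]\rangle\langle [i]| , \tag{8.84}
$$

where the vectors $|[i]\rangle := \frac{1}{\sqrt{\#[i]}} \sum_{a \in [i]} e^{i\beta_a} |z_a\rangle$ are orthogonal. Thus,

$$
S\big(\rho[\mathcal{Z}]\big) = -\sum_{[i]} \frac{\#[i]}{p} \log \frac{\#[i]}{p} = \sum_{[i]} \eta\Big( \frac{\#[i]}{p} \Big) , \tag{8.85}
$$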


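where (2.84) has been used. Consider now two OPUs of the form (8.83),

$$
\mathcal{Z}_1 = \Big\{ \frac{e^{i\beta^{1}_{i}}}{\sqrt{p_1}}\, W_\theta(n^{1}_{i}) \Big\}_{i=1}^{p_1} , \qquad
\mathcal{Z}_2 = \Big\{ \frac{e^{i\beta^{2}_{i}}}{\sqrt{p_2}}\, W_\theta(n^{2}_{i}) \Big\}_{i=1}^{p_2} .
$$

According to (8.56) and (7.29), their refinement is an OPU of the same form

$$
\mathcal{Z}_1 \circ \mathcal{Z}_2 = \Big\{ \frac{e^{i\beta(i,j)}}{\sqrt{p_1 p_2}}\, W_\theta\big(n^{1}_{i} + n^{2}_{j}\big) \Big\}_{i = 1, \ldots, p_1;\; j = 1, \ldots, p_2} . \tag{8.86}
$$

One now introduces the equivalence classes

$$
[i, j] := \big\{ (a, b) \,:\, n^{1}_{a} + n^{2}_{b} = n^{1}_{i} + n^{2}_{j} ,\; 1 \le a \le p_1 ,\; 1 \le b \le p_2 \big\} .
$$

Notice that if $x \in [a]_1$ is such that there exists $y \in [b]_2$ such that $(x, y) \in [i, j]$, then
this is true for all pairs $(u, v)$ with $u \in [a]_1$ and $v \in [b]_2$. Therefore, one can write

$$
\#[i, j] = \sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \#[a]_1\, \#[b]_2 .
$$


Lemma 8.2.2 The following two properties hold:

$$
\begin{aligned}
S\big(\rho[\mathcal{Z}_1 \circ \mathcal{Z}_2]\big) &= S\big(\rho[\mathcal{Z}_2 \circ \mathcal{Z}_1]\big) && (8.87) \\
S\big(\rho[\mathcal{Z}_1 \circ \mathcal{Z}_2]\big) &\ge S\big(\rho[\mathcal{Z}_2]\big) . && (8.88)
\end{aligned}
$$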


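Proof The first equality follows from (8.85) applied to (8.86) which gives

$$
S\big(\rho[\mathcal{Z}_1 \circ \mathcal{Z}_2]\big) = \sum_{[i,j]} \eta\Big( \frac{\#[i,j]}{p_1 p_2} \Big) .
$$

The second one can be derived as follows: set

$$
N(i,j) := \sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \frac{\#[a]_1}{p_1}
$$

and use that $\eta(xy) = x\, \eta(y) + y\, \eta(x) \ge x\, \eta(y)$ to estimate

$$
\begin{aligned}
S\big(\rho[\mathcal{Z}_1 \circ \mathcal{Z}_2]\big)
&= \sum_{[i,j]} \eta\Big( N(i,j) \sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \frac{1}{N(i,j)}\, \frac{\#[a]_1}{p_1}\, \frac{\#[b]_2}{p_2} \Big) \\
&\ge \sum_{[i,j]} N(i,j)\; \eta\Big( \sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \frac{1}{N(i,j)}\, \frac{\#[a]_1}{p_1}\, \frac{\#[b]_2}{p_2} \Big) .
\end{aligned}
$$

Since

$$
\sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \frac{1}{N(i,j)}\, \frac{\#[a]_1}{p_1} = 1 ,
$$

the concavity of $\eta(x)$ yields

$$
S\big(\rho[\mathcal{Z}_1 \circ \mathcal{Z}_2]\big) \ge \sum_{[i,j]} \; \sum_{[a]_1 \,:\, \exists b \text{ s.t. } [a,b] = [i,j]} \frac{\#[a]_1}{p_1}\; \eta\Big( \frac{\#[b]_2}{p_2} \Big)
= \sum_{[b]_2} \eta\Big( \frac{\#[b]_2}{p_2} \Big) = S\big(\rho[\mathcal{Z}_2]\big) .
$$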

In fact, by summing over all [7, j], one sums over all [b]2 and [a]1, the latter being 
as many as pı /#[a]. 
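
Formula (8.85) and Lemma 8.2.2 lend themselves to a direct numerical illustration, since by (8.84) the entropy of an OPU of the form (8.83) depends only on how many of its Weyl operators share the same lattice label. The sketch below makes this explicit under stated assumptions: the label lists `Z1` and `Z2` are hypothetical, the phases $\beta$ are ignored because they do not enter (8.85), and the refinement is represented by the sums of labels as in (8.86).

```python
import math
from collections import Counter

def eta(x):
    """eta(x) = -x log x, with eta(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0

def weyl_opu_entropy(labels):
    """S(rho[Z]) = sum over equivalence classes of eta(#[i]/p), as in (8.85)."""
    p = len(labels)
    return sum(eta(count / p) for count in Counter(labels).values())

def refine(labels1, labels2):
    """Lattice labels n_a + n_b of the refined OPU, as in (8.86)."""
    return [(a[0] + b[0], a[1] + b[1]) for a in labels1 for b in labels2]

# Hypothetical lattice vectors labelling the Weyl operators of two OPUs;
# the repeated label in Z1 produces a class with two elements.
Z1 = [(0, 0), (1, 0), (1, 0), (2, 1)]
Z2 = [(0, 0), (1, 0), (0, 1)]

s1 = weyl_opu_entropy(Z1)
s2 = weyl_opu_entropy(Z2)
s12 = weyl_opu_entropy(refine(Z1, Z2))
s21 = weyl_opu_entropy(refine(Z2, Z1))
```

In probabilistic terms, (8.88) says that the entropy of the sum of two independent lattice-valued random variables is not smaller than the entropy of either summand.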
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We now concentrate on a special OPU , Z = [z Wo(n ayy where, for 
a fixed q € N, we choose nj := ([a/] —1)n,0 < j < q, with Zan Æ 0. Also, 
[a/] < a denotes the integer part of the j-th power of the eigenvalue a > 1. The 
refined OPU 


Zot = EVN) o OM [Z] o.. O1[Z] 0 Z 


has |27-| = [a7]° elements of the form 
M e-1 
eI Won G)), a GO) = X li- DAD, 
k=0 


where j = joj... je—1 with je € T(g) := {0,1,..., [at] — 1. In order to eval- 
uate S (ol Z4 4), we have to investigate the equivalence classes determined by 
relations of the form n(r®) = n(s), r,s e 1°. By expanding n along the 
(linearly independent) eigenvectors |+) of AT relative to the eigenvalues at!, 
|n) = y|+) + 6|—), one gets 


e-1 
Dx") = DD = AD (EA - AS] + 
k=0 
£—2 
+ Pa DM] = PD) = 0, 
k=0 


By choosing q large enough, such an equality can only be true if re—1 = ke_1; 
iterating this argument yields 


n(r) = n(s) > rO = 5, 


This implies that the equivalence classes contain one element only, [r] = r®, so 
that (8.85) obtains 


1 1 
gf) = aD — q 
7 S (lZ 1) =7 log |Z‘4""’| = loglaf] . (8.89) 
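
The role of the integer part in (8.89) can be illustrated numerically: dividing $\log [a^q]$ by $q$ gives a quantity that approaches $\log a$ from below as $q$ grows. The sketch below assumes, for definiteness, the matrix $A$ with rows $(2, 1)$ and $(1, 1)$, a standard example of a hyperbolic automorphism which may differ from the one used in Example 2.1.3; the values of `q` are arbitrary.

```python
import math

def largest_eigenvalue(matrix):
    """Largest eigenvalue of a real 2x2 matrix with real spectrum."""
    (a, b), (c, d) = matrix
    tr, det = a + d, a * d - b * c
    return (tr + math.sqrt(tr * tr - 4 * det)) / 2

A = [[2, 1], [1, 1]]  # assumed example of a hyperbolic toral automorphism
a = largest_eigenvalue(A)

# (1/q) log [a^q] for increasing q, where [.] denotes the integer part.
rates = {q: math.log(math.floor(a ** q)) / q for q in (1, 2, 5, 10, 20)}
```

The gap between `rates[q]` and `math.log(a)` shrinks with `q`, which is why the lower bound in the next lemma follows by choosing $q$ arbitrarily large.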


Lemma 8.2.3 $h^{\mathrm{AFL}}_{\omega}(\Theta_A) \ge \log a$.


Proof Given the OPU of above, consider the refined OPU 


ZUEDH — @IN 2) o OY] o- OglZ] 0 Z. 
As already remarked, by refining any pair of the constituent OPUs one gets an OPU
of the form (8.83); thus, by repeatedly applying (8.87) and (8.88), one gets


s (eZee) > s (oz) A 


One finally estimates 


1 
hAFL (Oa) > lim sup ————_—— 
i pee e 


i 1 4 
> — lim sup zS (1 Z" 1) > 
q t++00 { 


S (z101) 


log[af] , 


Ql 


and the result follows by choosing g arbitrarily large. 


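In order to reverse the inequality in the previous lemma, we shall consider OPUs
of the form $\mathcal{Z} = \{W_\theta(f_i)\}_{i=1}^{p}$; notice that

$$
\sum_{i=1}^{p} W_\theta(f_i)^\dagger W_\theta(f_i) = \mathbb{1} \;\Longrightarrow\; \sum_{i=1}^{p} \|f_i\|^2 = 1 ,
$$

as turns out by computing the expectations with respect to $\omega$ of both sides of the
operatorial equation and by using (7.35). Let the support of $\mathcal{Z}$ be defined as the union
of the supports of the constituent $f_i$, $\mathrm{Supp}(\mathcal{Z}) := \bigcup_{i=1}^{p} \mathrm{Supp}(f_i)$, where $\mathrm{Supp}(f)$ is
the set of $n \in \mathbb{Z}^2$ where $f(n) \ne 0$ (see (7.31)). The compactness assumption means
that, given $\mathcal{Z}$, there exist finite real constants $g$, $d$ such that $\mathrm{Supp}(\mathcal{Z}) \subseteq R_{g,d}$, where

$$
R_{g,d} := \big\{ n \in \mathbb{Z}^2 \,:\, |n\rangle = \gamma\, |a_+\rangle + \delta\, |a_-\rangle ,\; |\gamma| \le g ,\; |\delta| \le d \big\} ,
$$

with $|a_\pm\rangle$ the eigenvectors of $A$.

As seen in Example 7.4.6, the vectors $\pi_\omega(W_\theta(f))|\Omega_\omega\rangle$ in the GNS representation
amount to the $\ell^2(\mathbb{Z}^2)$ vectors $|f\rangle = \{f(n)\}_{n \in \mathbb{Z}^2}$; thus, (8.65) gives

$$
S\big(\rho[\mathcal{Z}]\big) = S\Big( \sum_{i=1}^{p} |f_i\rangle\langle f_i| \Big) \le \log \#\big(\mathrm{Supp}(\mathcal{Z})\big) , \tag{8.90}
$$

where $\#(\mathrm{Supp}(\mathcal{Z}))$ denotes the cardinality of the support.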
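Since the refined OPU $\mathcal{Z}^{(n)}$ has elements

$$
\Theta_A^{n-1}\big[W_\theta(f_{i_{n-1}})\big]\, \Theta_A^{n-2}\big[W_\theta(f_{i_{n-2}})\big] \cdots \Theta_A\big[W_\theta(f_{i_1})\big]\, W_\theta(f_{i_0}) ,
$$

and each $\Theta_A^{j}[W_\theta(f_{i_j})]$ is supported by vectors of the form

$$
A^{j} |n_j\rangle = \gamma_j\, a^{j}\, |a_+\rangle + \delta_j\, a^{-j}\, |a_-\rangle ,
$$

with $n_j \in \mathrm{Supp}(f_{i_j})$, it turns out that $\mathrm{Supp}(\mathcal{Z}^{(n)})$ consists of vectors

$$
|n\rangle = \sum_{j=0}^{n-1} \gamma_j\, a^{j}\, |a_+\rangle + \sum_{j=0}^{n-1} \delta_j\, a^{-j}\, |a_-\rangle ,
$$

whose components along $|a_+\rangle$ and $|a_-\rangle$ are bounded in modulus by
$g \sum_{j=0}^{n-1} a^{j} = O(a^n)$ and $d \sum_{j=0}^{n-1} a^{-j} = O(1)$, respectively, whence
$\#(\mathrm{Supp}(\mathcal{Z}^{(n)})) = O(a^n)$ and, from (8.90),

$$
\limsup_{n \to +\infty} \frac{1}{n}\, S\big(\rho[\mathcal{Z}^{(n)}]\big) \le \log a
$$

for all OPUs from the chosen $\mathcal{A}_\theta$.

Lemma 8.2.4 $h^{\mathrm{AFL}}_{\omega}(\Theta_A) = \sup_{\mathcal{Z} \subset \mathcal{A}_\theta} h^{\mathrm{AFL}}_{\omega}(\Theta_A, \mathcal{Z}) \le \log a$.


Finally, Lemmas 8.2.3 and 8.2.4 together prove Proposition 8.2.7.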


8.2.4 AFL Entropy and Quantum Channel Capacities 


In Sect. 7.6.2 we have discussed the encoding of strings $i^{(n)}$ of symbols from an
alphabet $I_A$ emitted by a classical source by means of quantum code-words $\rho(i^{(n)})$
taken from a statistical ensemble with weights $p(i^{(n)})$. In [9] a different encoding
protocol is proposed; it uses a quantum dynamical system $(\mathcal{A}, \Theta, \omega)$ and the CPU
maps $M(\mathcal{A}_0) \ni \mathbb{E} : \mathcal{A} \to \mathcal{A}$. The idea is to encode strings $i^{(n)} = i_1 i_2 \cdots i_n$ by
perturbing the state $\omega$ with CPU maps $\mathbb{E}_{i_j}$ at each stroke of time $t = j$, $1 \le j \le n$.


Remark 8.2.5 As $\mathcal{A}$ is in general a quasi-local algebra, in order that the encoding
protocols be physically implementable, the CPU maps are chosen to consist of finitely
many Kraus operators taken from the union $\mathcal{A}_0$ of all strictly local subalgebras,

$$
\mathcal{A} \ni A \mapsto \mathbb{E}_{i_j}[A] = \sum_{k \in I(i_j)} X_{i_j k}^\dagger\, A\, X_{i_j k} , \qquad X_{i_j k} \in \mathcal{A}_0 ,
$$

where $\sum_{k \in I(i_j)} X_{i_j k}^\dagger X_{i_j k} = \mathbb{1}$ and the index set $I(i_j)$ is of finite cardinality. The
CPU maps are further distinguished in $\mathbb{E} \in M_b(\mathcal{A}_0)$ when they are bistochastic, see
Example 6.3.1.1, in which case they are entropy increasing, and $\mathbb{E} \in M_u(\mathcal{A}_0)$ when
the $X_{i_j k}$ are unitary: $M_u(\mathcal{A}_0) \subset M_b(\mathcal{A}_0) \subset M(\mathcal{A}_0)$.
In order to proceed with the explicit encoding, it is convenient to pass to the GNS
triple (Hu, Uw, Rw); set Xijk = Tw (Xi jk) and denote by 


R (B= Y» x}, LB Rik, Be BCHL,), (8.91) 
kel (ij) 


the GNS representation of the CPU maps 
let fF; j be their dual maps acting on Bı (Ho), 


i, as CPU maps on B(H,,). Moreover, 


lj 


BiG) 3P F= Yo Xin X},, (8.92) 
kel (ij) 


and let Uz! [P] := U.pu; denote the Schrédinger time-evolution in the GNS rep- 
resentation. Then, the encoding procedure proposed in [9] is to assign to a string 
iMel 7, a density matrix p(i “)) according to the following scheme: 


1 
Mrs EME) = pE™) = | | [U7 oF, | 1 2u)( Qu 
j=n 
= (Uz! o F;,) o (UZ! o Fi) o + Uz! o Fi, I] Ru) Rul], (8.93) 


The states p(i”) are the GNS representations of perturbed states obtained from the 
given ©-invariant state w as follows: 


n 
Win) =w0 I] li, 00 | =wo (Ej, 0 @) o (Bj, 0 O)o---( 
j=l 


i, 0O). (8.94) 


In 


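Example 8.2.2 Let the encoding (8.93) be based on a Bernoulli quantum spin chain
$(\mathcal{A}_{\mathbb{Z}}, \Theta_\sigma, \omega)$, where $\mathcal{A}_{\mathbb{Z}}$ consists of single site algebras $M_d(\mathbb{C})$, $\Theta_\sigma$ is the right shift
and $\omega$ the product state

$$
\omega(A) = \mathrm{Tr}\big( \underbrace{\rho \otimes \rho \otimes \cdots \otimes \rho}_{q - p + 1 \text{ times}}\; A \big) , \qquad A \in \mathcal{A}_{[p,q]} .
$$

Since the Kraus operators from the various CPU maps in (8.93) are finitely many and
belong to local subalgebras of $\mathcal{A}_{\mathbb{Z}}$, there exists an $\ell \in \mathbb{N}$ such that $X_{i_j k} \in \mathcal{A}_{[-\ell,\ell]}$
for all $i_j \in I_A$ and $k \in I(i_j)$. With respect to (8.94), each $\Theta = \Theta_\sigma$ shifts the Kraus
operators of the CPU map to the right by one site; therefore, the $X_{i_j k}$ of $\mathbb{E}_{i_j}$,
$2 \le j \le n$, will be shifted to the right by $j - 1$ sites. Hence, the perturbed states $\omega_{i^{(n)}}$
have the form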


= 7 (0) (1) a(n—1) n—1 
Wim SWO “ii [0] “iz Ores “in of 5 
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where the Kraus operators of i 1) , OÌ ! (Xi; ), belong to Aj—e+j—1, e+j-1)- 


It thus turns out that the state w,() in (8.94) amounts to a density matrix Pee E€ 


Ate, e—n—1] tensorized with p over the sites k ¢ [£, £ — n — 1]. In the GNS rep- 
resentation, the corresponding P(i®™) in (8.93) acts as a density matrix Pw on 


Tw(Al—e,e+n—1]) 8 Tu (Al-e,e4n—1))- 


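Example 8.2.3 Given a quantum spin chain as encoder, single-site encodings turn
out to be particularly useful; that is, we will use CPU maps consisting of Kraus
operators $X_i$ belonging to single site algebras $M_d(\mathbb{C})$. The following CPU maps are
three interesting possibilities.


1. Let $\{|i\rangle\}_{i=1}^{d}$ denote a fixed ONB in $\mathbb{C}^d$; then, consider the purifying maps
$M_d(\mathbb{C}) \ni \rho \mapsto \mathbb{F}_i[\rho] = \mathrm{Tr}(\rho)\, |i\rangle\langle i|$, $i = 1, 2, \ldots, d$.
These are the dual maps of the CPU maps

$$
M_d(\mathbb{C}) \ni A \mapsto \mathbb{E}_i[A] = \sum_{k=1}^{d} |k\rangle\langle i| A |i\rangle\langle k| = \langle i|A|i\rangle\, \mathbb{1} .
$$

The encoding is performed by choosing $X_{ik} = |i\rangle\langle k|$, $i, k = 1, 2, \ldots, d$, whence
if $Y \in \mathcal{A}$ is such that $\Theta_\sigma^{n-1}(Y) \in \mathcal{A}^{(n)} = \mathcal{A}_{[0,n-1]}$ then
$\omega_{i^{(n)}}(Y) = \langle i^{(n)}|\, \Theta_\sigma^{n-1}(Y)\, |i^{(n)}\rangle$, where $|i^{(n)}\rangle := |i_1\rangle \otimes |i_2\rangle \otimes \cdots |i_n\rangle$. As a consequence, this
particular encoding corresponds to a perturbation of $\omega$ such that
$\rho_{i^{(n)}} = |i^{(n)}\rangle\langle i^{(n)}| \in \mathcal{A}^{(n)}$.
2. Consider the discrete Weyl operators in Example 5.4.3 with $N = d$ and perform
the encoding corresponding to the CPU maps

$$
M_d(\mathbb{C}) \ni A \mapsto \mathbb{E}_n[A] = W_d(n)^\dagger\, A\, W_d(n) ,
$$

with $n = (n_1, n_2)$, $n_i = 0, 1, \ldots, d - 1$. The Kraus operators involved are thus of
the form $X_{ik} = W_d(n_i)$ where $i = 1, 2, \ldots, d^2$ enumerates the $d^2$ pairs $n$ and
$k = 1$ for all $i$. Then, choosing $Y \in \mathcal{A}$ as in the previous case, the perturbed states
turn out to be

$$
\omega_{i^{(n)}}(Y) = \mathrm{Tr}\Big( \bigotimes_{j=1}^{n} W_d(n_{i_j})\, \rho\, W_d^\dagger(n_{i_j})\; \Theta_\sigma^{n-1}(Y) \Big) ,
$$

corresponding to $\mathcal{A}^{(n)} \ni \rho_{i^{(n)}} = \bigotimes_{j=1}^{n} W_d(n_{i_j})\, \rho\, W_d^\dagger(n_{i_j})$.
3. Let $\rho = \sum_{j=1}^{d} r_j\, |r_j\rangle\langle r_j|$ be the spectral representation of the single site density
matrix and $|\sqrt{\rho}\rangle = \sum_{j=1}^{d} \sqrt{r_j}\, |r_j\rangle \otimes |r_j\rangle$ its purification. Consider the GNS
representation where $|\Omega_\omega\rangle\langle\Omega_\omega| = \big(|\sqrt{\rho}\rangle\langle\sqrt{\rho}|\big)^{\otimes\infty}$; then, the encoding in the
previous point yields a perturbed state (8.93) that amounts to a local density matrix of
the form

$$
\pi_\omega(\mathcal{A}^{(n)}) \otimes \pi_\omega(\mathcal{A}^{(n)})' \ni \tilde{\rho}_{i^{(n)}} = \bigotimes_{j=1}^{n} \big( W_d(n_{i_j}) \otimes \mathbb{1}_d \big)\, |\sqrt{\rho}\rangle\langle\sqrt{\rho}|\, \big( W_d^\dagger(n_{i_j}) \otimes \mathbb{1}_d \big) .
$$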


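As in Sect. 6.3.1, the classical source $A$ emitting symbols $i^{(n)} \in I_A^{n}$ with
probabilities $p(i^{(n)})$ is described as a stochastic variable $A^{(n)}$ with probability distribution
$\pi^{(n)}_A = \{p(i^{(n)})\}_{i^{(n)} \in I_A^{n}}$. The encoding (8.93) provides a statistical mixture described
by the density matrix

$$
\mathbb{B}_1(\mathbb{H}_\omega) \ni \rho_{\mathcal{E}^{(n)}} = \sum_{i^{(n)} \in I_A^{n}} p(i^{(n)})\, \rho(i^{(n)})
$$

and any decoding POVM $\mathcal{B}^{(n)} = \{B_j\}_{j \in I_B}$ by means of operators in $\mathbb{B}(\mathbb{H}_\omega)$ defines
another stochastic variable $B^{(n)}$. The mutual information (6.44) of $A^{(n)}$ and $B^{(n)}$ is
bounded by the Holevo $\chi$ quantity (6.45),

$$
I(A^{(n)}; B^{(n)}) \le S\big(\rho_{\mathcal{E}^{(n)}}\big) - \sum_{i^{(n)} \in I_A^{n}} p(i^{(n)})\, S\big(\rho(i^{(n)})\big) , \tag{8.95}
$$

and depends on the source probability $\pi^{(n)}_A$, on the CPU maps implementing the
encoding $\mathcal{E}$ and on the POVM $\mathcal{B}^{(n)}$.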

If the encoding (8.93) is based on Bernoulli quantum spin chains as in Exam- 
ple 8.2.2, then 


Bem = J, PEMPEM) =F ll 2211, (8.96) 


- (1) n 
i elj 


where, using the argument which led to (8.66), Y™ is a localized POVM whose 
elements are operators of the form 


on Xi 1kn— oe” (Xi 2kn— MEE X ioko € A —£,£+n-1] > 
with Xijk € Aj_¢,¢). Then, from (8.95), (8.67) and Proposition 8.2.3 
1(A; BO) < S Pew) = S (FHmll 2X2 1) = S (PWI) 
< (n — 2€)(S (p) + log d). (8.97) 


Indeed, p™ [Y] results from the tensor product density matrix p8 “72® on the algebra 
Ma (C)® (n—2£) p 
Furthermore, if the decoding is operated by means of local POVMs B consisting
of operators B; € Ajp,q}, then in (8.95) one can substitute the density matrices p(i (m) 


and Pem in the GNS representation with local density matrices pa) and Pe = 
Vm Pai ™) pi (i™). The corresponding Holevo’s bound reads 
1A; B®) <S (08%) J PES (dG). 898) 


»(n) n 
i" elj 


This bound and the fact that the various states are matrices in My(C)® “~ imply 
that, for encodings B by means of generic POVMs in A, 


I(A®; B®) < (n — 28) log d, (8.99) 
while, for POVMs B consisting of bistochastic maps 
1(A™; B™) < (n — 26)(log d — S(p)) , (8.100) 


for the encodings are entropy increasing so that S(p (i™) > S (p8 no), 


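Example 8.2.4 With reference to the three encodings in Example 8.2.3, the Holevo
$\chi$ quantities, denoted by $\chi_{1,2,3}$ for the sake of simplicity, depend only on the structure of
the perturbed states restricted to the first $n$ sites of the quantum spin chain $\mathcal{A}$. By
choosing uniform Bernoulli probability distributions $p(\boldsymbol{i}^{(n)}) = \prod_{j=1}^{n} p_{i_j}$ over the
indices $\boldsymbol{i}^{(n)}$, the product structure of the perturbed states yields:

1. Let $\pi = \{p_i = 1/d\}_{i=1}^{d}$; since the states $\rho(\boldsymbol{i}^{(n)})$ are pure-state projectors,
$S\big(\rho(\boldsymbol{i}^{(n)})\big) = 0$ and

$$\sum_{\boldsymbol{i}^{(n)}} p(\boldsymbol{i}^{(n)})\,\rho(\boldsymbol{i}^{(n)}) = \Big(\frac{\mathbb{1}_d}{d}\Big)^{\otimes n} \;\Longrightarrow\; \chi_1 = n \log d .$$

2. Let $\pi = \{p_i = 1/d^2\}_{i=1}^{d^2}$; since $\rho(\boldsymbol{i}^{(n)}) = \bigotimes_{j=1}^{n} W_d(\boldsymbol{n}_{i_j})\,\rho\,W_d^\dagger(\boldsymbol{n}_{i_j})$, additivity of
the von Neumann entropy (5.170) and unitarity of the Weyl operators imply
$S\big(\rho(\boldsymbol{i}^{(n)})\big) = n\,S(\rho)$. Further, from (5.32) it follows that
$\sum_{i=1}^{d^2} W_d(\boldsymbol{n}_i)\,\rho\,W_d^\dagger(\boldsymbol{n}_i) = d\,\mathbb{1}_d$. Thus

$$\sum_{\boldsymbol{i}^{(n)}} p(\boldsymbol{i}^{(n)})\,\rho(\boldsymbol{i}^{(n)}) = \Big(\frac{\mathbb{1}_d}{d}\Big)^{\otimes n} \;\Longrightarrow\; \chi_2 = n\big(\log d - S(\rho)\big) .$$

3. Since the states $\rho(\boldsymbol{i}^{(n)}) = \bigotimes_{j=1}^{n} \big(W_d(\boldsymbol{n}_{i_j}) \otimes \mathbb{1}_d\big)\,|\sqrt{\rho}\rangle\langle\sqrt{\rho}|\,\big(W_d^\dagger(\boldsymbol{n}_{i_j}) \otimes \mathbb{1}_d\big)$ are pure,
$S\big(\rho(\boldsymbol{i}^{(n)})\big) = 0$. Also, using again (5.32), it follows that

$$\frac{1}{d} \sum_{i=1}^{d^2} \big(W_d(\boldsymbol{n}_i) \otimes \mathbb{1}_d\big)\,|\sqrt{\rho}\rangle\langle\sqrt{\rho}|\,\big(W_d^\dagger(\boldsymbol{n}_i) \otimes \mathbb{1}_d\big) = \mathbb{1}_d \otimes \mathrm{Tr}_1\big(|\sqrt{\rho}\rangle\langle\sqrt{\rho}|\big) = \mathbb{1}_d \otimes \rho ,$$

where $\mathrm{Tr}_1$ denotes partial trace over the first factor. Therefore, choosing the
probability $\pi$ as in the previous point,

$$\sum_{\boldsymbol{i}^{(n)}} p(\boldsymbol{i}^{(n)})\,\rho(\boldsymbol{i}^{(n)}) = \Big(\frac{\mathbb{1}_d}{d} \otimes \rho\Big)^{\otimes n} \;\Longrightarrow\; \chi_3 = n\big(\log d + S(\rho)\big) .$$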
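The three Holevo quantities of the example can be evaluated numerically once the spectrum of the single site density matrix is known. The following Python sketch is only an illustration: the function names, the sample spectrum and the choice of natural logarithms are assumptions that do not appear in the text.

```python
import math

def von_neumann_entropy(spectrum):
    """S(rho) = -sum_j r_j log r_j, computed from the eigenvalues r_j of rho."""
    return -sum(r * math.log(r) for r in spectrum if r > 0.0)

def holevo_per_site(spectrum):
    """Per-site Holevo quantities (chi_1/n, chi_2/n, chi_3/n) of Example 8.2.4."""
    d = len(spectrum)
    s = von_neumann_entropy(spectrum)
    chi1 = math.log(d)       # pure code states, uniform mixture (1/d)^{(x)n}
    chi2 = math.log(d) - s   # Weyl-rotated rho: every code state has entropy S(rho) per site
    chi3 = math.log(d) + s   # Weyl-rotated purification: mixture (1/d (x) rho)^{(x)n}
    return chi1, chi2, chi3

# Hypothetical single-site spectrum with d = 3 (illustrative values only).
spectrum = [0.7, 0.2, 0.1]
chi1, chi2, chi3 = holevo_per_site(spectrum)
print(chi1, chi2, chi3)
```

For any spectrum the three values are ordered as $\chi_2 \le \chi_1 \le \chi_3$, and $\chi_2$ vanishes exactly when $\rho$ is maximally mixed.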


According to Sect. 3.2.2, the classical capacity of the channel resulting from the
considered encodings is:

$$C := \sup_{\pi^{(n)},\,\mathcal{E}^{(n)},\,\mathcal{B}^{(n)}}\ \limsup_{n \to \infty}\ \frac{1}{n}\,I(A^{(n)}; B^{(n)}) , \tag{8.101}$$

where the supremum is computed varying probabilities, encoding and decoding
protocols. The following possibilities are envisaged:

1. entanglement assisted capacity, $C_{ent}$, when $\mathcal{B} = \{B_i\}_{i \in I_B} \subset \mathcal{B}(\mathcal{H}_\omega)$ consists of
bounded operators on the GNS Hilbert space;

2. ordinary capacities, $C \ge C_b \ge C_u$, when $\mathcal{B} = \{B_i\}_{i \in I_B} \subset \mathcal{A}$ and the encoding
$\mathcal{E}^{(n)}$ is performed with any localized CPU map ($C$), with localized bistochastic
CPU maps ($C_b$) and with localized CPU maps consisting of unitary Kraus
operators ($C_u$);

3. Bernoulli capacities, $C_{ent}^b \ge C^b \ge C_b^b \ge C_u^b$, when the supremum is taken over
input probabilities that factorize: $p(\boldsymbol{i}^{(n)}) = \prod_{j=1}^{n} p_{i_j}$.


Remark 8.2.6 The entanglement in the capacity $C_{ent}$ is due to the GNS vector $|\Omega_\omega\rangle$
being entangled over the algebra $\pi_\omega(\mathcal{A}) \otimes \pi_\omega(\mathcal{A})'$ and the considered POVMs $\mathcal{B}$
consist of generic operators in $\mathcal{B}(\mathcal{H}_\omega)$. The entanglement of $|\Omega_\omega\rangle$ is most simply seen
in the case of a Bernoulli quantum spin chain as the one discussed in Example 8.2.3;
there, the GNS construction amounts to purifying the single site density matrix $\rho$
into a vector $|\sqrt{\rho}\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$ which entangles $M_d(\mathbb{C}) \otimes \mathbb{1}$ with $\mathbb{1} \otimes M_d(\mathbb{C})$ at
each single site.


The entanglement assisted capacity of Bernoulli quantum sources is bounded by
the AFL entropy, for all triplets $(\mathcal{A}, \Theta, \omega)$, while the capacity equals the AFL entropy
in the case of Bernoulli quantum spin chains.


Proposition 8.2.8 The entanglement assisted capacity relative to Bernoulli classical
sources encoded by using quantum dynamical systems $(\mathcal{A}, \Theta, \omega)$ is bounded by the
AFL entropy: $C_{ent}^b \le h^{AFL}(\Theta, \mathcal{A}_0)$.

Moreover, the capacities of encodings by Bernoulli quantum spin chains
$(\mathcal{A}, \Theta_\sigma, \omega_\rho)$ can be explicitly computed:

$$C_u^b = C_b^b = C_u = C_b = \log d - S(\rho) \tag{8.102}$$
$$C^b = C = \log d \tag{8.103}$$
$$C_{ent}^b = C_{ent} = S(\rho) + \log d . \tag{8.104}$$

Proof As regards the first part of the proposition, the source probabilities
$p(\boldsymbol{i}^{(n)}) = \prod_{j=1}^{n} p_{i_j}$ by assumption; therefore, (8.93) and (8.66) yield


1 
Bem = D> pé®yYpE™) = [JT] US!o YS pF | 20) 2a 


ier! j=n ijela 


= UZ” o Fool RuN Rwol], Æ = [VPI X i) ieu + 


Then, from (8.67) and (8.95), $I(A^{(n)}; B^{(n)}) \le S\big(\tilde\rho_{\mathcal{E}^{(n)}}\big) = S\big(\rho^{(n)}[\mathcal{X}]\big)$, whence the
result follows from Definition 8.63.

Concerning the second part of the proposition, for single site encodings of
Bernoulli classical sources by means of Bernoulli quantum spin chains, we can use the
result in Theorem 7.6.4. It ensures that one can always find a suitable decoding POVM
such that the asymptotic amount of transmitted information per symbol, namely the
argument of the supremum in (8.101), equals the corresponding Holevo $\chi$ quantity.
Consider now the first case in Example 8.2.4: $\chi_1/n = \log d$ and (8.99) imply that
$\log d \le C^b \le C \le \log d$, whence (8.103). In the second case $\chi_2/n = \log d - S(\rho)$;
this and (8.100) imply

$$\log d - S(\rho) \le C_u^b \le C_b \le \log d - S(\rho) .$$

Thus, (8.102) results from the fact that $C_u^b \le C_b^b \le C_b$ and $C_u^b \le C_u \le C_b$. Finally, by
means of the same argument, (8.104) follows from the third case in the quoted example
and from (8.97):

$$\log d + S(\rho) = \chi_3/n \le C_{ent}^b \le C_{ent} \le \log d + S(\rho) .$$


8.2.4.1 Bibliographical Notes 

The recent book [264] represents a most up to date and complete approach to CNT 
entropy and its applications to C* and von Neumann dynamical systems of physical 
and mathematical origin. It also presents in full detail the approach to the CNT entropy 
developed in [311] and the construction of dynamical and topological entropies due 
to Voiculescu [367]. Another formulation of a quantum topological entropy, namely 
of a quantum dynamical entropy independent of the given invariant state, can be 
found in [185]. 

Older books also dealing with the CNT entropy are [24,268], while a detailed 
presentation of the AFL entropy and its applications is in [11]. The relations between
the CNT entropy and the AFL entropy are reviewed in [186]. 

A different proposal of quantum dynamical entropy based on coherent states and 
suitable to applications to quantum chaos and the semi-classical limit is in [93,332, 
333]. Possible applications of quantum dynamical entropies to chaotic phenomena 
in quantum spin chains are discussed in [288].


9 Quantum Algorithmic Complexities


As already emphasized in Chap. 4 information is physical and the limits to infor- 
mation processing tasks are ultimately set by the underlying physical laws. For 
instance, the standing models of computation are based on the physics of deter- 
ministic and/or stochastic classical processes; instead, quantum computation theory 
[84,90, 156,198,266] studies the new possibilities offered by a model of computation 
based on quantum mechanical laws. The birth of such a theory finds a technological 
motivation in the high pace at which chip miniaturization proceeds. Indeed, infor- 
mation processing at the atomic level, namely at a scale where the physical laws are 
those of quantum mechanics, might soon become a concrete practical issue [84]. On 
the other hand, a strong theoretical impulse to the development of quantum compu- 
tation theory came from Feynman’s suggestion [139,170] that quantum computers 
might provide a more efficient description of quantum systems than classical
(probabilistic) computers and, above all, from the discovery of quantum algorithms
that perform more efficiently than what is classically achievable [266].

A first theoretical step in this direction was the extension of the notions of TM
and of UTM to those of quantum Turing machines (QTMs) [121] and of universal
QTMs (UQTMs) [72]: very roughly speaking, these are computing devices that
work as classical TMs and UTMs, the only difference being that their configurations
behave as vector states of a suitable Hilbert space. Namely, given any set of possible
configurations, their linear superpositions are also possible configurations.

Once the existence of UQTMs is foreseen, a very natural theoretical step is to try 
to formulate quantum versions of the concepts introduced in Chap. 4; in particular, 
by extending algorithmic complexity theory to the quantum setting, one may try to 
set up a theory of randomness of individual quantum states [144] and of quantum 
processes. 

In the following, we shall consider some proposals that have recently been put
forward concerning different ways in which one might approach the algorithmic
complexity of quantum states. All proposals start from the basic intuitive idea that
complexity should characterize properties of systems that are difficult to describe;
they can roughly be summarized as follows:


1. one may attempt to describe quantum states by means of other quantum states that
are processed by UQTMs [73]: the corresponding complexity will be referred to
as qubit quantum complexity and denoted by $QC_q$;

2. one may decide to describe quantum states by classical [365] programs run by 
UQTMs : the corresponding complexity will be denoted by QC, and referred to 
as bit quantum complexity; 

3. one may choose to relate the complexity of a qubit string to the complexity of the 
(classical) description of the quantum circuits that construct the qubit string [247, 
248]. The corresponding complexity will be denoted by QCnet and referred to as 
circuit quantum complexity; 

4. one may extend the notion of universal probability (see Chap.4) and define a 
quantum universal semi-density matrix [143]. There then arise two possible defi- 
nitions of quantum complexity, denoted by QC, that do not refer either to QTM 
or to circuits. 


We shall mainly concentrate on the qubit quantum complexity $QC_q$: it allows
for a quantum generalization of Brudno's theorem that will be presented in detail.
On general grounds, one should not expect the above proposals to yield equivalent
notions; very likely, each one of them will be sensitive to different specific quantum
properties, as we have seen to be the case with the quantum extensions of the
KS dynamical entropy. Unlike in the classical domain (see Remark 4.3.1.4), where
chaoticness and typicalness appear to be equivalent characterizations of random bit
sequences, qubit sequences are likely to be random in different inequivalent ways.


9.1 Effective Quantum Descriptions 


The notions of qubit and bit quantum complexity are based on the use of QTMs.
In the following, we will not consider what quantum computers might do that
classical computers do not, nor will we address their practical implementation (see
for instance [156,266]). We shall simply assume that such devices exist and proceed
to define:


1. the targets of the algorithmic descriptions processed by QTMs;
2. which kinds of algorithms are processed by QTMs;
3. how these algorithms are processed by QTMs;
4. which are the outputs of these processes.


1. In the quantum setting, the targets of the effective descriptions will be qubit
strings; since one is always interested in targets of increasing length, a convenient
mathematical framework is provided by quantum spin chains (see Sect. 7.4.5),
namely by algebraic triples of the form $(\mathcal{A}_{\mathbb{Z}}, \Theta, \omega)$ that have been introduced
in Definition 7.4.11, with $2 \times 2$ matrix algebras at each site. As already noted in
Remarks 7.6.1, in going from bit to qubit strings there are similarities, but also
differences. In particular, there is a larger variety of qubit strings. Therefore, by
qubit strings it will be meant any local density matrix corresponding to generic
mixed and entangled states on local subalgebras $\mathcal{A}^{(n)} = (M_2(\mathbb{C}))^{\otimes n}$.

2. The inputs to QTMs will be generic qubit strings, loosely referred to as quantum
programs; a subclass of these are the classical programs or bit strings that QTMs,
as extensions of classical TMs, must also be able to process.

3. While classical TMs ultimately amount to specific transition functions between
their configurations, QTMs are defined by transition amplitudes between their
configurations, which form a Hilbert space. Any QTM will thus identify a specific
quantum computation, that is a specific unitary operator acting on the Hilbert
space of its configurations.

4. Finally, the output of a quantum computation operated by a QTM will be a qubit
string read out by a measurement process.


Within the framework just outlined, the first two generalizations of classical
algorithmic complexity previously mentioned are based on qubit strings effectively
described by qubit strings in the first case and by bit strings in the second one.


9.1.1 Effective Descriptions by qubit Strings 


Given the quasi-local structure of quantum sources as $C^*$ algebras generated by
local $n$-qubit sub-algebras $\mathcal{A}^{(n)} = M_2(\mathbb{C})^{\otimes n}$, let us denote by $\mathcal{H}_k := (\mathbb{C}^2)^{\otimes k}$ the
Hilbert space of $k$ qubits ($k \in \mathbb{N}_0$) and fix in each single qubit Hilbert space $\mathbb{C}^2$ a
computational basis $|0\rangle$, $|1\rangle$. In order to be as general as possible, superpositions
of qubit states of different lengths $k$ are allowed: they correspond to vectors in the
Fock-like Hilbert space $\mathcal{H}_F := \bigoplus_{k=0}^{\infty} \mathcal{H}_k$. More generally, qubit strings will be
represented by density matrices $\rho \in \mathcal{B}_1^+(\mathcal{H}_F)$ acting on $\mathcal{H}_F$.


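Example 9.1.1 Any bit string $\boldsymbol{i} \in \{0, 1\}^*$ identifies a computational basis vector in
$\mathcal{H}_F$: the empty string $\lambda$ corresponds to the vacuum $|\Omega_F\rangle$, the 1-qubit subspace
$\mathcal{H}_1$ is spanned by $|0\rangle$, $|1\rangle$, while the $k$-qubit subspace $\mathcal{H}_k$ is generated by the
vectors corresponding to the bit strings of length $k$, $\boldsymbol{i}^{(k)} \in \{0, 1\}^k$, namely by
$|\boldsymbol{i}^{(k)}\rangle = |i_1 i_2 \cdots i_k\rangle$, $i_j = 0, 1$. Generic qubit strings amount to density matrices in
$\mathcal{B}_1^+(\mathcal{H}_{\le n})$ acting on $\mathcal{H}_{\le n} := \bigoplus_{k=0}^{n} \mathcal{H}_k$, its dimension being $\sum_{k=0}^{n} 2^k = 2^{n+1} - 1$.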
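The counting in the example can be checked directly. The short Python sketch below enumerates the bit-string labels of the computational basis of $\mathcal{H}_{\le n}$; the function names are illustrative assumptions, and the empty string plays the role of the vacuum.

```python
from itertools import product

def basis_labels(n):
    """Bit strings of length 0..n labelling the computational basis of H_{<=n}.
    The empty string (length 0) labels the vacuum vector."""
    return ["".join(bits) for k in range(n + 1) for bits in product("01", repeat=k)]

def dim_H_leq(n):
    """dim H_{<=n} = sum_{k=0}^{n} 2^k."""
    return sum(2 ** k for k in range(n + 1))

labels = basis_labels(2)
print(labels)        # ['', '0', '1', '00', '01', '10', '11']
print(dim_H_leq(2))  # 7
```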


In the commutative setting, the length of a bit string is simply the number of bits
it consists of; in the quantum setting, the number of qubits involved fixes the Hilbert
space dimension. Therefore, the following definition naturally extends the notion of
length of a program.

Definition 9.1.1 The length, $\ell(\rho)$, of a qubit string $\rho \in \mathcal{B}_1^+(\mathcal{H}_F)$ is

$$\ell(\rho) := \min\{n \in \mathbb{N}_0 \,|\, \rho \in \mathcal{B}_1^+(\mathcal{H}_{\le n})\} , \tag{9.1}$$

setting $\ell(\rho) = \infty$ if this set is empty.
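A minimal Python sketch of this definition, assuming a hypothetical representation in which a vector qubit string is a dictionary from bit-string labels to amplitudes and a density matrix is a finite list of weighted vectors, could read as follows. In such a finite representation the case $\ell(\rho) = \infty$ never occurs.

```python
import math

def length_of_vector(amplitudes, tol=1e-12):
    """Smallest n such that the vector lies in H_{<=n}.
    `amplitudes` maps bit-string labels to (complex) amplitudes."""
    support = [len(label) for label, a in amplitudes.items() if abs(a) > tol]
    if not support:
        raise ValueError("the zero vector is not a state")
    return max(support)

def length_of_mixture(mixture, tol=1e-12):
    """Length of rho = sum_i w_i |psi_i><psi_i|, given as [(w_i, amplitudes_i), ...]."""
    return max(length_of_vector(psi, tol) for w, psi in mixture if w > tol)

s = 1 / math.sqrt(2)
psi = {"0": s, "101": s}                        # superposition of a 1-qubit and a 3-qubit string
rho = [(0.5, {"": 1.0}), (0.5, {"11": 1.0})]    # mixture of the vacuum and |11>
print(length_of_vector(psi), length_of_mixture(rho))
```

The superposition has length 3 even though one of its components is a single qubit: the length is fixed by the longest component with non-vanishing weight.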


As we shall see, QTMs act on and construct superpositions of vector qubit strings
and, more generally, convex combinations of projections onto vector qubit strings of
different lengths. Moreover, like their classical counterparts, QTMs comprise different
parts, such as a read/write head, a control unit and one or more tapes, all of them
capable of being in states that are either Hilbert space vectors or density matrices
acting on them. Therefore, the QTM configurations too are generically described by
density matrices acting on appropriate Hilbert spaces. Notice that mixed states are
quite typical in such a context, for they naturally appear when one is interested in the
state of the read/write head, say, and therefore traces over the Hilbert spaces
corresponding to the other QTM components.

As observed in Remark 7.6.1.3, unlike in the classical situation where there are
countably many bit strings, there are uncountably many qubit strings that can be
arbitrarily close to one another. In order to quantify how close two qubit strings
$\rho, \sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$ actually are, it is convenient to use the trace-distance $D(\rho, \sigma)$
introduced in Definition 6.3.4.


9.1.2 Quantum Turing Machines 


Any model of computation is based on the physics of the processors perform- 
ing the computations; both deterministic and probabilistic Turing machines (see 
Example 4.1.4) work according to the laws of classical physics. It was Feynman [139] 
who was the first one to argue that quantum processes, to be efficiently simulated,
require quantum computers. Indeed, quantum mechanics allows superpositions
of states; in the case of TMs, the natural classical states are their configurations
$c = (\{\sigma_i\}_{i \in \mathbb{Z}}, q, k)$ (see Definition 4.1.1). The main feature of quantum computing
machines is the possibility of producing and acting on linear superpositions of
classical configurations, thus of performing in one single step of a computation what,
classically, would only be achieved by an enormous number of TMs working in
parallel (this is quantum parallelism, a phenomenon briefly sketched in Example 6.1.1).

The Hilbert space spanned by the classical configurations $|c\rangle$ provides (vector)
states $|\Psi\rangle = \sum_{c \in \mathcal{C}} \Psi(c)\,|c\rangle$ of the QTM, with Fourier coefficients $\Psi(c)$ that
represent the complex amplitudes associated to the computational steps $c$. As in the case of
PTMs, a quantum computation corresponds to a level-tree with an initial configuration
branching into others, the main difference being that the edges leading from one
level to the next do not carry branching probabilities, rather branching amplitudes
that give rise to interference effects.

Fig. 9.1 Quantum Turing machines: level tree (level-2 configurations $c_{21}$, $c_{22}$, $c_{23}$, $c_{24}$)


Example 9.1.2 With reference to Example 4.1.4 [156], consider the following
branching tree that resembles the scheme of a Mach-Zehnder interferometer (see
Fig. 9.1). It describes a computational process starting off with an initial configuration
$c_0$ that branches into two different configurations $c_{11}$ and $c_{12}$ at level 1 with
amplitudes $a_{01} := a(c_0 c_{11})$ and $a_{02} := a(c_0 c_{12})$, both equal to $2^{-1/2}$, so that
$|a_{01}|^2 + |a_{02}|^2 = 1$. This first computational step is then followed by a second one with
the two configurations at level 1 branching as follows: $c_{11}$ into $c_{21}$ and $c_{22}$ with amplitudes
$a_{11} := a(c_{11} c_{21}) = 1/\sqrt{2}$ and $a_{12} := a(c_{11} c_{22}) = 1/\sqrt{2}$, while $c_{12}$ into $c_{23}$ and $c_{24}$
with amplitudes $a_{23} := a(c_{12} c_{23}) = -1/\sqrt{2}$ and $a_{24} := a(c_{12} c_{24}) = 1/\sqrt{2}$.
Thus, the overall amplitudes for the 4 configurations at step 2 are

$$a(c_{21}) := a_{01} a_{11} = \frac{1}{2} , \qquad a(c_{22}) := a_{01} a_{12} = \frac{1}{2} ,$$
$$a(c_{23}) := a_{02} a_{23} = -\frac{1}{2} , \qquad a(c_{24}) := a_{02} a_{24} = \frac{1}{2} .$$

The most important difference with respect to classical PTMs is now apparent;
indeed, consider the case of equal configurations $c_{22}$ and $c_{23}$, say $c_{22} = c_{23} = c^*$.
Then, the amplitude for $c^*$ is the sum of the amplitudes for $c_{22}$ and $c_{23}$, $a(c^*) = 0$,
whence $p(c^*) = |a(c^*)|^2 = 0$. The corresponding destructive interference eliminates
the configuration $c^*$ from the computation. On the other hand, assume
$c_{21} = c_{24} = c_*$; these two configurations constructively interfere at level 2 so that
$a(c_*) = 1$, whence $c_*$ appears among the computational steps with probability
$p(c_*) = 1$.
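The interference pattern of the example can be reproduced with a few lines of Python. The sketch below is only illustrative: the dictionary layout and the function name are assumptions, and the `quantum=False` branch mimics a PTM in which path probabilities, rather than path amplitudes, are added.

```python
import math
from collections import defaultdict

s = 1 / math.sqrt(2)
level1 = {"c11": s, "c12": s}                       # amplitudes a01, a02
level2 = {("c11", "c21"): s, ("c11", "c22"): s,     # amplitudes a11, a12
          ("c12", "c23"): -s, ("c12", "c24"): s}    # amplitudes a23, a24

def level2_weights(identify, quantum=True):
    """Sum the path contributions reaching each final configuration.
    quantum=True adds amplitudes; quantum=False adds probabilities (PTM-like)."""
    totals = defaultdict(float)
    for (parent, child), a in level2.items():
        path = level1[parent] * a
        totals[identify.get(child, child)] += path if quantum else abs(path) ** 2
    return dict(totals)

# Identify c22 = c23 = c^* and c21 = c24 = c_*, as in the example.
identify = {"c22": "c^*", "c23": "c^*", "c21": "c_*", "c24": "c_*"}
amplitudes = level2_weights(identify, quantum=True)
probabilities = {c: abs(a) ** 2 for c, a in amplitudes.items()}
classical = level2_weights(identify, quantum=False)
print(probabilities, classical)
```

With amplitudes the configuration $c^*$ is suppressed and $c_*$ is reached with certainty, whereas adding probabilities would assign weight $1/2$ to each of them.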


The notion of QTM as a computing device working according to quantum mechanics
was first proposed by Deutsch [121]. A full and detailed analysis can be found
in [4,72] and further developments in connection with the notion of universality
in [250]. In the following, we shall assume the existence of such machines and
provide a schematic presentation of how they perform their tasks. QTMs work
analogously to classical TMs, that is they consist of


1. An internal control unit C with associated Hilbert space $\mathcal{H}_C$ linearly spanned by
the classical control states $q_i$, $i = 1, 2, \ldots, |Q|$, the typical control vector being

$$|\psi_C\rangle = \sum_{i=1}^{|Q|} c_i\,|q_i\rangle , \qquad \sum_{i=1}^{|Q|} |c_i|^2 = 1 .$$

We shall distinguish special initial and final control states $q_0$ and $q_f$, respectively.

2. An input/output tape, whose vector states are of the form

$$|\psi_T\rangle = \sum_{\sigma \in \Sigma^*} t(\sigma)\,|\sigma\rangle ,$$

where $\sigma \in \Sigma^*$ denotes any sequence consisting of infinitely many blanks and
only finitely many symbols from the alphabet $\Sigma = \{0, 1\}$ (see Sect. 4.1.1). The
basis states $|\sigma\rangle$ correspond to classical tape-configurations and span a (separable)
tape Hilbert space $\mathcal{H}_T$.

3. A read/write head H that can position itself on the tape cells labeled by the integers
$k \in \mathbb{Z}$. The head Hilbert space $\mathcal{H}_H$ is formed by square-summable sequences and
the typical head vector state is

$$|\psi_H\rangle = \sum_{k \in \mathbb{Z}} h(k)\,|k\rangle , \qquad \sum_{k \in \mathbb{Z}} |h(k)|^2 = 1 .$$


A QTM $\mathfrak{U}$ will then be described by means of a Hilbert space of the form
$\mathcal{H}_{\mathfrak{U}} = \mathcal{H}_T \otimes \mathcal{H}_C \otimes \mathcal{H}_H$ with the configuration basis vectors $|\sigma, q, k\rangle$ providing a
distinguished orthonormal basis.

The time-evolution of standard quantum mechanical systems, that is of systems isolated
from their environment, is linear and reversible; as any step of a quantum computation
corresponds, in absence of external noise, to a physical quantum process, it must be
described by means of a unitary operator $U_{\mathfrak{U}} : \mathcal{H}_{\mathfrak{U}} \to \mathcal{H}_{\mathfrak{U}}$.


Remarks 9.1.1 


1. The probabilistic transition functions (4.4) are replaced by a quantum transition
function which assigns amplitudes (not probabilities) to the transitions
$(q, \sigma) \mapsto (q', \sigma', d)$:

$$(q, \sigma; q', \sigma', d) \mapsto \delta(q, \sigma; q', \sigma', d) \in \tilde{\mathbb{C}} , \tag{9.2}$$

where $\tilde{\mathbb{C}}$ denotes the set of complex numbers $\alpha \in \mathbb{C}$, $0 \le |\alpha| \le 1$, such that there
is a deterministic algorithm that computes the real and imaginary parts of $\alpha$ to
within any fixed precision $2^{-n}$ in time polynomial in $n$.

Consider the linear operator $U_{\mathfrak{U}}$ on $\mathcal{H}_{\mathfrak{U}}$ whose matrix elements with respect to
the configuration basis vectors are defined by [272]

$$\langle q', \sigma', k' |\, U_{\mathfrak{U}} \,| q, \sigma, k \rangle = \begin{cases} \delta(q, \sigma_k; q', \sigma'_k, -1) & \text{if } k' = k - 1 \\ \delta(q, \sigma_k; q', \sigma'_k, +1) & \text{if } k' = k + 1 \end{cases} , \tag{9.3}$$

where $d = \pm 1$ identify a head's movement to the left ($d = -1$), respectively to
the right ($d = +1$), and the tape (classical) configurations $\sigma$ and $\sigma'$ are such that
their symbols $\sigma'_j = \sigma_j$ for all $j \ne k$.

In this way the quantum transition function $\delta$ identifies the possible transitions
operated by the linear operator $U_{\mathfrak{U}}$; namely, $U_{\mathfrak{U}}$ operates a transition

$$\begin{Bmatrix} \text{tape conf.: } \sigma \\ \text{cell with head on it: } k \\ \text{symbol in cell } k\text{: } \sigma_k \end{Bmatrix} \mapsto \begin{Bmatrix} \text{tape conf.: } \sigma' \\ \text{cell with head on it: } k + d \\ \text{symbol in cell } k\text{: } \sigma'_k \end{Bmatrix}$$

if and only if $\delta(q, \sigma_k; q', \sigma'_k, d) \ne 0$.
Using the orthogonality of the configuration vectors, one explicitly computes the
action of $U_{\mathfrak{U}}$ as

$$U_{\mathfrak{U}}\,| q, \sigma, k \rangle = \sum_{q', \sigma'_k, d} \delta(q, \sigma_k; q', \sigma'_k, d)\,| q', \sigma'_k, k + d \rangle , \tag{9.4}$$

where, inside the ket, $\sigma'_k$ denotes the tape-configuration with all symbols equal to those
of $\sigma$, but for the $k$-th one. In [272] necessary and sufficient conditions are given on the
quantum transition function $\delta$ so that $U_{\mathfrak{U}}$ acts unitarily on $\mathcal{H}_{\mathfrak{U}}$ and thus appropriately
describes a quantum computation as a unitary discrete-time quantum evolution.

2. A possible model of a QTM is obtained via a quantum circuit consisting of unitary
gates (see Example 5.5.9), a so-called circuit model. A quantum computation on,
say, $N$ qubits thus amounts to a unitary operator $U$ acting on a $2^N$-dimensional
Hilbert space $\mathcal{H}_N$. It requires a certain number of gates to be implemented; if one
had at disposal all 1-qubit unitary gates plus the CNOT 2-qubit gate, then any $U$
would be exactly implementable [198]. In particular, the action of $U : \mathcal{H}_N \to \mathcal{H}_N$
on a given state $|\psi\rangle$ requires $O(2^N)$ of these gates to be implemented [248].
More constructively, one seeks finite sets of gates $G$ that would provide a so-called
complete gate basis in the sense that the action of any 1-qubit gate can be
mimicked by gates from $G$ up to an arbitrary precision: one such set consists [198]
of the CNOT gate, the Hadamard gate and the 1-qubit gate

$$T = \begin{pmatrix} e^{-i\pi/8} & 0 \\ 0 & e^{i\pi/8} \end{pmatrix} .$$

Consider a generic unitary action $U : \mathcal{H}_N \to \mathcal{H}_N$ of a quantum circuit consisting
of $m$ 1-qubit and CNOT gates; a result known as the Solovay-Kitaev theorem
[198,204,248,359] states that $U$ can be reproduced up to any $\varepsilon > 0$
by $O(m \log^c(m/\varepsilon))$ gates from $G$ with $c \in [1, 2]$. Then, on a given $|\psi\rangle \in \mathcal{H}_N$,
$\|(U - V)|\psi\rangle\| \le \varepsilon$ where $V : \mathcal{H}_N \to \mathcal{H}_N$ is a unitary operator corresponding
to a quantum circuit consisting of $N(U, \varepsilon)$ gates from $G$, where [248],

$$N(U, \varepsilon) = O\Big(2^N \log^c\Big(\frac{2^N}{\varepsilon}\Big)\Big) .$$

3. Another model of quantum computation is based on the possibility of implementing
unitary operations on qubits by means of the mechanism outlined in
Example 6.1.5. The latter is at the root of the so-called one-way quantum
computation [294,384], whereby quantum gates, that is unitary transformations, on
qubits are implemented by performing measurements, that is irreversible operations,
on some other qubits, all of them prepared in certain entangled multipartite
states called cluster states.


Definition 9.1.2 (QTMs Starting and Evolution Conventions) Given a UQTM $\mathfrak{U}$
and an input qubit string $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$, we shall identify it with the initial state of a
quantum computation by $\mathfrak{U}$, again denoted by $\sigma$. It corresponds to a density matrix
acting on $\mathcal{H}_{\mathfrak{U}}$ with $\sigma$ written on the input track over the cells indexed by $[0, \ell(\sigma) - 1]$,
and blank states # on the remaining cells of the input track and on the whole output
track, while the control is in the distinguished initial state $q_0$ and the head is in the
state corresponding to its being positioned upon the 0 cell. The state $\mathfrak{U}^t(\sigma)$ of $\mathfrak{U}$ on
input $\sigma$ after $t \in \mathbb{N}_0$ computational steps will be given by
$\mathfrak{U}^t(\sigma) := U_{\mathfrak{U}}^t\,\sigma\,(U_{\mathfrak{U}}^\dagger)^t$.


In the rest of this section we shall deal with the halting conditions for QTMs
and with showing that their actions amount to definite quantum operations, that is
to trace-preserving completely positive maps on $\mathcal{B}_1^+(\mathcal{H}_F)$. For this observe that, in
accordance with the previous definition, the state of the control after $t$ steps is given
by partial trace over all the other parts of the machine, that is over the head and tape
Hilbert spaces, $\mathcal{H}_H$ and $\mathcal{H}_T$, respectively: $\mathfrak{U}_C^t(\sigma) := \mathrm{Tr}_{H,T}\big(\mathfrak{U}^t(\sigma)\big)$.


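Definition 9.1.3 (QTMs Halting Convention) A QTM $\mathfrak{U}$ halts at time $t \in \mathbb{N}_0$, that
is after $t$ computational steps, on input $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$, iff

$$\langle q_f |\, \mathfrak{U}_C^t(\sigma) \,| q_f \rangle = 1 \quad \text{and} \quad \langle q_f |\, \mathfrak{U}_C^{t'}(\sigma) \,| q_f \rangle = 0 \ \text{ for every } t' < t , \tag{9.5}$$

where $q_f$ is a special control state.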
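The halting convention can be phrased as a simple test on the sequence of probabilities $\langle q_f |\, \mathfrak{U}_C^{t'}(\sigma) \,| q_f \rangle$, $t' = 0, 1, 2, \ldots$ The Python sketch below assumes these numbers are already available as a finite list; the function name and the numerical tolerance are illustrative choices.

```python
def halting_time(final_state_probs, tol=1e-12):
    """Return t if the sequence p(0), p(1), ... with p(t') = <q_f| U_C^{t'}(sigma) |q_f>
    satisfies (9.5), i.e. p(t) = 1 and p(t') = 0 for all t' < t.
    Return None if no halting occurs within the supplied horizon or if some
    earlier value lies strictly between 0 and 1 (non-halting input)."""
    for t, p in enumerate(final_state_probs):
        if abs(p - 1.0) <= tol:
            return t
        if abs(p) > tol:
            return None
    return None

print(halting_time([0.0, 0.0, 1.0]))   # 2
print(halting_time([0.0, 0.5, 1.0]))   # None
```

The second input is rejected because the final control state is reached with probability strictly between 0 and 1 before time 2.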


Remark 9.1.2 The above halting convention expresses the possibility of checking
whether a QTM halts on a certain input by measuring the orthogonal projection
$|q_f\rangle\langle q_f|$: if $\mathfrak{U}$ has halted on input $\sigma$ then, by measuring $|q_f\rangle\langle q_f|$, one ascertains
this fact with certainty. On the other hand, if $\mathfrak{U}$ has not yet halted, then measuring
$|q_f\rangle\langle q_f|$ has no effect on the still ongoing computation. In general, for a generic
input $\sigma = |\psi\rangle\langle\psi| \in \mathcal{B}_1^+(\mathcal{H}_F)$, $0 < \langle q_f |\, \mathfrak{U}_C^t(|\psi\rangle\langle\psi|) \,| q_f \rangle < 1$; in such a case the
vector $|\psi\rangle$ will be called non-halting, otherwise $t$-halting. Let $\mathcal{H}(t) \subset \mathcal{H}_F$ denote
the set of vector inputs with equal halting time $t$: their linear combinations are also
inputs such that $\mathfrak{U}$ halts on them at time $t$. Therefore, $\mathcal{H}(t)$ is a linear subspace of
$\mathcal{H}_F$; what is more important, if $t \ne t'$, the corresponding subspaces $\mathcal{H}(t)$ and $\mathcal{H}(t')$
are mutually orthogonal. Indeed, were this not true, non-orthogonal vectors could be
perfectly distinguished by means of their different halting times. It follows that the
subset of $\mathcal{B}_1^+(\mathcal{H}_F)$ on which $\mathfrak{U}$ halts is the union $\bigcup_{t \in \mathbb{N}} \mathcal{B}_1^+(\mathcal{H}(t))$.

It proves convenient to consider a special class of QTMs with the property that
their tape T consists of two different tracks, an input track I and an output track
O. This can be achieved by having an alphabet which is a Cartesian product of two
alphabets, in our case $\Sigma = \{0, 1, \#\} \times \{0, 1, \#\}$. Then, the tape Hilbert space $\mathcal{H}_T$ can
be written as $\mathcal{H}_T = \mathcal{H}_I \otimes \mathcal{H}_O$.


Definition 9.1.4 (Quantum Turing Machines) A map $\mathfrak{U} : \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathcal{H}_F)$ will
be called a QTM, if there is a two-track QTM $\mathfrak{U}$ with the following properties [72]:


1. the alphabet consists of $\Sigma = \{0, 1, \#\} \times \{0, 1, \#\}$;

2. the corresponding time evolution operator $U_{\mathfrak{U}}$ is unitary;

3. if $\mathfrak{U}$ halts on input $\sigma$ with a variable-length qubit string $\rho \in \mathcal{B}_1^+(\mathcal{H}_F)$ on the output
track starting in cell 0 such that the $i$-th cell is empty for every $i \notin [0, \ell(\rho) - 1]$,
then $\mathfrak{U}(\sigma) = \rho$; otherwise, $\mathfrak{U}(\sigma)$ is undefined.


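In general, different inputs $\sigma$ have different halting times $t(\sigma)$ and the corresponding
outputs result from different unitary transformations $U_{\mathfrak{U}}^{t(\sigma)}$. However, notice that
the subset of $\mathcal{B}_1^+(\mathcal{H}_F)$ on which $\mathfrak{U}$ is defined is of the form $\bigcup_{t \in \mathbb{N}} \mathcal{B}_1^+(\mathcal{H}(t))$. Therefore,
by introducing an internal clock that keeps track of the halting times, the action
of $\mathfrak{U}$ restricted to this subset amounts to a well-defined quantum operation, that is to
a completely positive map $\mathcal{U} : \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathcal{H}_F)$.


Lemma 9.1.1 (QTM as Quantum Operations) For every QTM $\mathfrak{U}$ there is a quantum
operation $\mathcal{U} : \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathcal{H}_F)$, such that $\mathfrak{U}(\sigma) = \mathcal{U}[\sigma]$ for every
$\sigma \in \bigcup_{t \in \mathbb{N}} \mathcal{B}_1^+(\mathcal{H}(t))$.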


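Proof Let $B_t$ be an orthonormal basis of $\mathcal{H}(t)$, $t \in \mathbb{N}$, and $B_\perp$ an orthonormal basis
in the orthogonal complement of $\bigoplus_{t \in \mathbb{N}} \mathcal{H}(t)$ within $\mathcal{H}_F$. Let an ancilla Hilbert space
$\mathcal{H}_A := \ell^2(\mathbb{N}_0)$ be added to the QTM, and define a linear operator
$V_{\mathfrak{U}} : \mathcal{H}_F \to \mathcal{H}_{\mathfrak{U}} \otimes \mathcal{H}_A$ by specifying its action on the orthonormal basis vectors
$\bigcup_{t \in \mathbb{N}} B_t \cup B_\perp$:

$$V_{\mathfrak{U}}\,|b\rangle = \begin{cases} U_{\mathfrak{U}}^t\,|b\rangle \otimes |t\rangle & \text{if } |b\rangle \in B_t \\ |b\rangle \otimes |0\rangle & \text{if } |b\rangle \in B_\perp \end{cases} .$$

The ancilla acts as a sort of internal clock which registers the halting times of the
components of a vector belonging to the halting subspaces and assigns time 0 to the
non-halting components. With $B_t = \{|b_j^t\rangle\}$, $B_\perp = \{|b_j^\perp\rangle\}$:

$$\mathcal{H}_F \ni |\Psi\rangle = \sum_{t \in \mathbb{N}} \sum_{j_t} C_{j_t}^t(\Psi)\,|b_{j_t}^t\rangle + \sum_j C_j^\perp(\Psi)\,|b_j^\perp\rangle ,$$
$$V_{\mathfrak{U}}\,|\Psi\rangle = \sum_{t \in \mathbb{N}} \sum_{j_t} C_{j_t}^t(\Psi)\,U_{\mathfrak{U}}^t\,|b_{j_t}^t\rangle \otimes |t\rangle + \sum_j C_j^\perp(\Psi)\,|b_j^\perp\rangle \otimes |0\rangle .$$

From orthogonality, it turns out that the map $V_{\mathfrak{U}}$ is a partial isometry:

$$\langle \Psi |\, V_{\mathfrak{U}}^\dagger V_{\mathfrak{U}} \,| \Phi \rangle = \langle \Psi | \Phi \rangle , \qquad \Psi, \Phi \in \mathcal{H}_F .$$

Thus, the map $\sigma \mapsto V_{\mathfrak{U}}\,\sigma\,V_{\mathfrak{U}}^\dagger$ is trace-preserving and completely positive
(see Sect. 5.2.2). Further, by partial tracing over the Hilbert spaces of the head, of the
control unit, of the input tape and of the internal clock, one obtains
the quantum operation $\mathcal{U}[\sigma] := \mathrm{Tr}_{C,H,I,A}\big(V_{\mathfrak{U}}\,\sigma\,V_{\mathfrak{U}}^\dagger\big)$.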
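The role of the clock in the proof can be illustrated on a toy model. The Python sketch below assumes a four-dimensional input space, a one-step unitary given by a cyclic shift of the basis vectors and halting times assigned by hand; none of these choices comes from the text, and they only serve to show that the clock register restores orthogonality between images that would otherwise coincide.

```python
DIM, CLOCK = 4, 3     # 4 basis vectors e0..e3; clock values 0, 1, 2

# Hypothetical one-step unitary: cyclic shift U e_j = e_{(j+1) mod 4}.
U = [[1.0 if i == (j + 1) % DIM else 0.0 for j in range(DIM)] for i in range(DIM)]
# Assumed halting times: e0, e2 span H(1), e1 spans H(2), e3 is non-halting (clock 0).
halting_time = {0: 1, 1: 2, 2: 1, 3: 0}

def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inner(u, v):
    """Inner product of two real vectors (real amplitudes suffice here)."""
    return sum(a * b for a, b in zip(u, v))

def machine_part(b):
    """U^t e_b, with t the halting time assigned to e_b."""
    vec = [1.0 if i == b else 0.0 for i in range(DIM)]
    for _ in range(halting_time[b]):
        vec = matvec(U, vec)
    return vec

def V_column(b):
    """V e_b = U^t e_b (x) |t>, stored as a vector of length DIM * CLOCK."""
    out = [0.0] * (DIM * CLOCK)
    for i, x in enumerate(machine_part(b)):
        out[i * CLOCK + halting_time[b]] = x
    return out

columns = [V_column(b) for b in range(DIM)]
gram = [[inner(columns[a], columns[b]) for b in range(DIM)] for a in range(DIM)]
gram_no_clock = [[inner(machine_part(a), machine_part(b)) for b in range(DIM)]
                 for a in range(DIM)]
print(gram)
print(gram_no_clock)
```

Here `gram` is the identity matrix, so that $V^\dagger V = \mathbb{1}$, while `gram_no_clock` is not: without the clock, the images of $e_1$, $e_2$ and $e_3$ would all coincide with $e_3$.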


We have seen in Chap. 4 that the definition of algorithmic complexity rests on a
solid ground because the length of the shortest effective description of a bit string is
essentially independent of the computer that computes it once this is chosen from the
class of universal Turing machines. Clearly, any definition of quantum complexity
based on using QTMs will also need the existence of universal QTMs in order to be
essentially machine-independent. In [72], a UQTM $\mathfrak{U}$ was constructed that works as
follows: for any QTM $\mathfrak{A}$ there exists a classical description (bit string) $i_{\mathfrak{A}}$ of $\mathfrak{A}$ such
that

$$D\big(\mathfrak{U}(i_{\mathfrak{A}}, T, \delta, |\psi\rangle\langle\psi|),\ \mathfrak{A}^T(|\psi\rangle\langle\psi|)\big) < \delta ,$$

for all inputs $|\psi\rangle\langle\psi| \in \mathcal{B}_1^+(\mathcal{H}_F)$, computational steps $T$ and $\delta > 0$, with $D(\cdot, \cdot)$ the
trace-distance introduced in Definition 6.3.4.

According to this definition $\mathfrak{U}$ is universal in that it simulates any other QTM up
to an arbitrary accuracy for a given number of steps; notice that this latter piece of
information must be part of the input. This means that, if $\mathfrak{A}$ halts on a certain input,
$\mathfrak{U}$ is able to approximate the output of $\mathfrak{A}$ only if provided with the halting time.
While such a definition works perfectly well for the aims of [72], which are directed
to see the impact of QTMs on computational complexity (see Remark 4.1.2), it is on
the other hand not appropriate for an approach to quantum algorithmic complexity
simply because the halting times are likely to be enormous and in any case cannot
be given beforehand.

A useful definition of a UQTM $\mathfrak{U}$ for algorithmic purposes must then be independent
of the halting time of the simulated QTM $\mathfrak{A}$. The main problem is that, as
the simulation is only approximate, such is in particular the simulation of the control
state of $\mathfrak{A}$ whence, when $\mathfrak{A}$ halts, $\mathfrak{U}$ will in general do it only with a certain
probability, thus violating the halting convention in Definition 9.1.3. In [250] it is shown
how such a problem can be circumvented and how one can arrive at the following
operative definition of UQTM which is fully consistent from the point of view of
quantum algorithmic complexity.


Theorem 9.1.1 (Strongly UQTMs) [250] There is a QTM \ such that for every 
QTM 2 and every qubit string o for which (o) is defined, there is a qubit string 
og such that 


D (W8, ox), A(o)) <6 =O E QT, 
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where (see Definition 9.1.1) (oy) < (o) + Cy, D(..-) is the trace-distance and 
Ca € N is a constant depending only on . 


In the following theorem, both the universal simulator, 4, and the QTM to be 
simulated, 2l, are provided with a quantum input and a classical input fixing the 
accuracy 6 of the approximation. 


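Theorem 9.1.2 (Parameter Strongly UQTMs) [250] There is a UQTM $\mathfrak{U}$ with the
properties of the previous theorem such that for every QTM $\mathfrak{A}$ and every qubit string
$\sigma$, there is a qubit string $\sigma_{\mathfrak{A}}$ such that, if $\mathfrak{A}(2k, \sigma)$ is defined, then

\[ D\big(\mathfrak{U}(k, \sigma_{\mathfrak{A}}),\, \mathfrak{A}(2k, \sigma)\big) < \frac{1}{2k} \qquad \forall\, k \in \mathbb{N} , \]

and $\ell(\sigma_{\mathfrak{A}}) \le \ell(\sigma) + C_{\mathfrak{A}}$, with $C_{\mathfrak{A}} \in \mathbb{N}$ depending only on $\mathfrak{A}$.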


This result is not just a corollary of the preceding theorem: indeed, according to
Theorem 9.1.1, the input $\sigma_{\mathfrak{A}}$ may in general depend on $k$.


Remark 9.1.3 A UQTM is able to apply a unitary transformation $U$ on some segment
of its tape within an accuracy of $\delta$, if it is supplied with a complex matrix $\tilde{U}$ as
input such that

\[ \| U - \tilde{U} \| < \frac{\delta}{2\,(10\sqrt{d})^{d}} , \]

$d$ being the size of the matrix. The machine cannot apply $U$ exactly; in fact, it
only knows an approximation $\tilde{U}$. It also cannot apply $\tilde{U}$ directly, for $\tilde{U}$ is only
approximately unitary, and the machine can only work unitarily. Instead, it will
effectively apply another unitary transformation $V$ which is close to $\tilde{U}$ and thus
close to $U$, such that $\| V - U \| < \delta$. Let $|\psi\rangle := U |\psi_0\rangle$ be the output that one wants
to have from $U$ and let $|\varphi\rangle := V |\psi_0\rangle$ be the approximation that is really computed by
the machine. Then, both the norm and trace-distance are small: $\| |\varphi\rangle - |\psi\rangle \| < \delta$,

\[ D\big( |\varphi\rangle\langle\varphi| ,\, |\psi\rangle\langle\psi| \big) < \delta . \]
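As a small numerical illustration of the last claim in Remark 9.1.3, the following Python sketch compares the norm distance and the trace-distance of two pure states. It assumes the normalization in which the trace-distance is half the trace norm, so that it equals $\sqrt{1 - |\langle\varphi|\psi\rangle|^2}$ on pure states; the function names and the planar rotation used as a toy "nearby unitary" are illustrative assumptions and are not taken from [250].

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi> of two vectors given as lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm_distance(phi, psi):
    """Euclidean norm of |phi> - |psi>."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(phi, psi)))

def pure_trace_distance(phi, psi):
    """Trace-distance of the projectors onto phi and psi.

    Assumed convention: half the trace norm, i.e. sqrt(1 - |<phi|psi>|^2).
    """
    overlap = abs(inner(phi, psi)) ** 2
    return math.sqrt(max(0.0, 1.0 - overlap))

def rotate(theta, vec):
    """Apply the 2x2 rotation by angle theta to a qubit vector (a toy unitary)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * vec[0] - s * vec[1], s * vec[0] + c * vec[1]]

psi0 = [1 + 0j, 0j]
psi = rotate(0.7, psi0)          # ideal output U|psi0>
phi = rotate(0.7 + 1e-3, psi0)   # output V|psi0> of a slightly different unitary
```

In this toy case both distances are of the order of the angle mismatch, and the trace-distance never exceeds the norm distance, in line with the remark.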
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As already remarked, unlike bit strings, qubit strings, are uncountably many and 
cannot be expected to be exactly reproducible by a QTM. It rather makes sense to
try to approximate a target qubit string $\rho$ by a qubit string $\tilde{\rho}$ within a trace-distance
$0 < D(\rho, \tilde{\rho}) \ll 1$ ($\tilde{\rho} \approx \rho$). According to the previous section, $\tilde{\rho}$ will be the output of
a QTM $\mathfrak{U}$ that executes a quantum program $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$: $\tilde{\rho} := \mathfrak{U}[\sigma] \approx \rho$.
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Remark 9.2.1 In view of the definition of classical algorithmic complexity, one is 
particularly interested in whether the length of $\sigma$ (see 9.1) can be made shorter
than that of $\rho$ itself: $\ell(\sigma) < \ell(\rho)$. The minimum possible length $\ell(\sigma)$ for reproducing
$\rho$ will get us close to the notion of qubit quantum complexity $QC_q$. There are at least
two natural possible definitions. The first one is to demand only optimal (in the sense
of minimal length) approximate reproductions of $\rho$ within some trace-distance $\delta$. The
second one is based on the notion of an approximation scheme. In order to define
the latter, the chosen QTM has to be supplied with two inputs, the qubit string and
a parameter.


Definition 9.2.1 Let $k \in \mathbb{N}$ and $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$. Let $\beta(k)$ denote the string that consists
of the at most $\lfloor \log_2 k \rfloor$ bits of the binary expansion of $k$, each repeated twice, and
ends with 01. Let $|\beta(k)\rangle\langle\beta(k)|$ be the corresponding projector in the computational
basis. The map $(k, \sigma) \mapsto C(k, \sigma) := |\beta(k)\rangle\langle\beta(k)| \otimes \sigma$ defines an encoding
$C : \mathbb{N} \times \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathcal{H}_F)$ of the pair $(k, \sigma)$ into a single qubit string $C(k, \sigma)$. Note that

\[ \ell(C(k, \sigma)) = 2\lfloor \log_2 k \rfloor + 2 + \ell(\sigma) . \tag{9.6} \]

We shall denote by $\mathfrak{U}(k, \sigma)$ the result of the action of a QTM $\mathfrak{U}$ on $C(k, \sigma)$.
The above encoding has the typical self-delimiting form that we have already met
in Example 4.1.5. In this way, the QTM $\mathfrak{U}$ is able to detach in $C(k, \sigma)$ the information
about $k$ from that about $\sigma$.
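The classical part of the encoding in Definition 9.2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the leading 1 of the binary expansion of $k$ is dropped, so that exactly $\lfloor \log_2 k \rfloor$ bits are doubled and the length agrees with (9.6); the qubit string $\sigma$ is replaced by a classical bit string as a stand-in; the names `beta`, `encode` and `decode` are hypothetical.

```python
def beta(k):
    """Self-delimiting code of a positive integer k.

    Assumption: the leading 1 of the binary expansion is dropped, so that
    exactly floor(log2 k) bits are doubled; this reproduces the length in (9.6).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    tail = bin(k)[3:]                      # binary digits of k without the leading '1'
    return "".join(b + b for b in tail) + "01"

def encode(k, sigma_bits):
    """Classical stand-in for C(k, sigma): prefix the payload with beta(k)."""
    return beta(k) + sigma_bits

def decode(word):
    """Split an encoded word back into (k, payload) by reading pairs up to '01'."""
    digits, pos = "1", 0
    while word[pos:pos + 2] != "01":
        pair = word[pos:pos + 2]
        if pair not in ("00", "11"):
            raise ValueError("not a valid encoding")
        digits += pair[0]
        pos += 2
    return int(digits, 2), word[pos + 2:]
```

Because doubled bits are always 00 or 11, the first pair 01 marks unambiguously where the description of $k$ ends; this is what allows the machine to separate the information about $k$ from that about $\sigma$.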


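Definition 9.2.2 (qubit Quantum Complexity) Let $\mathfrak{U}$ be a QTM and $\rho \in \mathcal{B}_1^+(\mathcal{H}_F)$
a qubit string. For every $\delta > 0$, the finite-accuracy quantum complexity $QC^{\delta}_{\mathfrak{U}}(\rho)$ is
defined as the minimal length $\ell(\sigma)$ of any quantum program $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$ such that
the corresponding output $\mathfrak{U}(\sigma)$ has a trace-distance from $\rho$ smaller than $\delta$,

\[ QC^{\delta}_{\mathfrak{U}}(\rho) := \min\big\{ \ell(\sigma) \,:\, D(\rho, \mathfrak{U}(\sigma)) < \delta \big\} . \tag{9.7} \]

Similarly, an approximation-scheme quantum complexity $QC_{\mathfrak{U}}$ is defined as the
minimal length $\ell(\sigma)$ of any density operator $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$ such that, when processed by
$\mathfrak{U}$ together with any integer $k$, the output $\mathfrak{U}(k, \sigma)$ has trace-distance from $\rho$ smaller
than $1/k$, for all $k$:

\[ QC_{\mathfrak{U}}(\rho) := \min\Big\{ \ell(\sigma) \,:\, D(\rho, \mathfrak{U}(k, \sigma)) < \frac{1}{k} \ \text{for every } k \in \mathbb{N} \Big\} . \tag{9.8} \]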


We now show that Theorems 9.1.1 and 9.1.2 allow one to prove the independence
(up to an additive constant) of the above definitions from the chosen QTM $\mathfrak{U}$ if this is
universal as specified in those theorems. Accordingly, we will fix an arbitrary UQTM
$\mathfrak{U}$ and, as in the classical case, drop reference to it and set

\[ QC_q(\rho) := QC_{\mathfrak{U}}(\rho) , \qquad QC^{\delta}_q(\rho) := QC^{\delta}_{\mathfrak{U}}(\rho) . \tag{9.9} \]




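Theorem 9.2.1 There is a QTM $\mathfrak{U}$ such that for every QTM $\mathfrak{A}$ there exist constants
$C_{\mathfrak{A}} > 0$ and $C_{\mathfrak{A},\delta,\Delta}$ such that for every qubit string $\rho \in \mathcal{B}_1^+(\mathcal{H}_F)$ and $0 < \delta < \Delta$, it
holds that

\[ QC_{\mathfrak{U}}(\rho) \le QC_{\mathfrak{A}}(\rho) + C_{\mathfrak{A}} , \qquad QC^{\Delta}_{\mathfrak{U}}(\rho) \le QC^{\delta}_{\mathfrak{A}}(\rho) + C_{\mathfrak{A},\delta,\Delta} . \]

Proof Let $\ell = QC^{\delta}_{\mathfrak{A}}(\rho)$; then there exists $\sigma$ such that, according to (9.7), $\ell = \ell(\sigma)$
and $D(\mathfrak{A}[\sigma], \rho) < \delta$. On the other hand, Theorem 9.1.1 implies that there exists a
QTM $\mathfrak{U}$ and a density matrix $\sigma_{\mathfrak{A}}$ such that

\[ D\big(\mathfrak{U}(\Delta - \delta, \sigma_{\mathfrak{A}}),\, \mathfrak{A}(\sigma)\big) < \Delta - \delta , \]

whence, by the triangle inequality,

\[ D\big(\mathfrak{U}(\Delta - \delta, \sigma_{\mathfrak{A}}),\, \rho\big) < \Delta . \]

Moreover, $\ell(\sigma_{\mathfrak{A}}) \le \ell(\sigma) + C_{\mathfrak{A}} = QC^{\delta}_{\mathfrak{A}}(\rho) + C_{\mathfrak{A}}$; thus, with $C(\Delta - \delta, \sigma_{\mathfrak{A}})$ as in
Definition 9.2.1, using (9.6) it follows that

\[ \ell\big(C(\Delta - \delta, \sigma_{\mathfrak{A}})\big) \le \ell(\sigma_{\mathfrak{A}}) + C_{\delta,\Delta} \le QC^{\delta}_{\mathfrak{A}}(\rho) + C_{\mathfrak{A},\delta,\Delta} , \]

whence $QC^{\Delta}_{\mathfrak{U}}(\rho) \le QC^{\delta}_{\mathfrak{A}}(\rho) + C_{\mathfrak{A},\delta,\Delta}$.

If $\ell = QC_{\mathfrak{A}}(\rho)$, then there exists a qubit string $\sigma$ such that $\ell = \ell(\sigma)$ and
$D(\mathfrak{A}(k, \sigma), \rho) < 1/k$ for all $k \in \mathbb{N}$. On the other hand, Theorem 9.1.2 says that there
exists a QTM $\mathfrak{U}$ and a density matrix $\sigma_{\mathfrak{A}}$ such that $D\big(\mathfrak{U}(k, \sigma_{\mathfrak{A}}), \mathfrak{A}(2k, \sigma)\big) < \frac{1}{2k}$.
It follows that

\[ D\big(\mathfrak{U}(k, \sigma_{\mathfrak{A}}), \rho\big) \le D\big(\mathfrak{U}(k, \sigma_{\mathfrak{A}}), \mathfrak{A}(2k, \sigma)\big) + D\big(\mathfrak{A}(2k, \sigma), \rho\big) < \frac{1}{2k} + \frac{1}{2k} = \frac{1}{k} . \]

Together with the fact that $\ell(\sigma_{\mathfrak{A}}) \le \ell(\sigma) + C_{\mathfrak{A}} = QC_{\mathfrak{A}}(\rho) + C_{\mathfrak{A}}$, this implies
$QC_{\mathfrak{U}}(\rho) \le QC_{\mathfrak{A}}(\rho) + C_{\mathfrak{A}}$.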


Remarks 9.2.2 


1. Definition 9.2.2 is essentially equivalent to that in [73], the only technical differ- 
ence being the use of the trace distance rather than the fidelity. 

2. The same qubit program $\sigma$ is accompanied by a classical specification of an integer
$k$, which tells the program to what accuracy the computation of the output state
must be accomplished. Notice that in (9.8) the minimal length has to be sought
among those $\sigma$ each of which yields an approximation of $\rho$ within $1/k$
for all $k$: this is an effective procedure.

3. The exact choice of the accuracy $1/k$ is not important; choosing any computable
function that tends to zero for $k \to \infty$ will give an equivalent definition (in the
sense of being equal up to some constant). The same is true for the choice of the
encoding $C$: as long as $k$ and $\sigma$ can both be computably decoded from $C(k, \sigma)$
and as long as there is no way to extract additional information on the desired
output $\rho$ from the $k$-description part of $C(k, \sigma)$, the results will be equivalent up
to a suitable constant.


Examples 9.2.1 


1. If $\mathfrak{U}$ is a UQTM, a noiseless transmission channel (implementing the identity
transformation) between the input and output tracks can always be realized: this
corresponds to classical literal transcription, so that automatically $QC^{\delta}_{\mathfrak{U}}(\rho) \le
\ell(\rho) + c_{\mathfrak{U}}$ for some constant $c_{\mathfrak{U}}$. Of course, the key point in classical as well as in
quantum algorithmic complexity is that there sometimes exist much shorter qubit
programs than just literal transcription.

2. The finite-accuracy and approximation-scheme complexities are related to each other by
the following inequality: for every QTM $\mathfrak{U}$ and every $k \in \mathbb{N}$,

\[ QC^{1/k}_{\mathfrak{U}}(\rho) \le QC_{\mathfrak{U}}(\rho) + 2\lfloor \log_2 k \rfloor + 2 , \qquad \forall\, \rho \in \mathcal{B}_1^+(\mathcal{H}_F) . \]

Indeed, if $QC_{\mathfrak{U}}(\rho) = \ell$, there is $\sigma \in \mathcal{B}_1^+(\mathcal{H}_F)$ with $\ell(\sigma) = \ell$, such that
$D(\mathfrak{U}[k, \sigma], \rho) < 1/k$ for every $k \in \mathbb{N}$. Then $\sigma' := C(k, \sigma)$, where $C$ is the encoding
in Definition 9.2.1, is such that $D(\mathfrak{U}[\sigma'], \rho) < 1/k$ and

\[ QC^{1/k}_{\mathfrak{U}}(\rho) \le \ell(\sigma') \le 2\lfloor \log_2 k \rfloor + 2 + \ell = 2\lfloor \log_2 k \rfloor + 2 + QC_{\mathfrak{U}}(\rho) , \]

where the second inequality follows from (9.6).


9.2.1 Quantum Brudno’s Theorem 


In this section, we prove a quantum version of Brudno’s theorem (Theorem 4.2.1), by
means of which we shall connect the quantum entropy rate $s$ of an ergodic quantum
spin chain to the qubit complexities $QC_q(\rho)$ and $QC^{\delta}_q(\rho)$ of qubit strings that are
pure states $\rho = |\psi\rangle\langle\psi|$ of the chain. It will be shown that there are sequences of
typical subspaces of $(\mathbb{C}^2)^{\otimes n}$, such that the complexity rates $\frac{1}{n} QC_q(q)$ and $\frac{1}{n} QC^{\delta}_q(q)$
of any one-dimensional projector $q$ onto a state belonging to these subspaces can
be made arbitrarily close to the entropy rate by choosing $n$ large enough. Moreover,
there are no such sequences with a smaller expected complexity rate.


Theorem 9.2.2 (Quantum Brudno’s Theorem) Let $(\mathcal{A}, \omega)$ be an ergodic quantum
source with entropy rate $s$. For every $\delta > 0$, there exists a sequence of $\omega$-typical
projectors $q_n(\delta) \in \mathcal{A}^{(n)}$, $n \in \mathbb{N}$, i.e. $\lim_{n\to\infty} \mathrm{Tr}(\rho^{(n)} q_n(\delta)) = 1$, such that for every
one-dimensional projector $q \le q_n(\delta)$ and $n$ large enough

\[ \frac{1}{n}\, QC_q(q) \in (s - \delta,\ s + \delta) , \tag{9.10} \]
\[ \frac{1}{n}\, QC^{\delta}_q(q) \in \big(s - \delta(2 + \delta)s,\ s + \delta\big) . \tag{9.11} \]

Moreover, $s$ is the optimal expected asymptotic complexity rate, in the sense that
every sequence of projectors $q_n \in \mathcal{A}^{(n)}$, $n \in \mathbb{N}$, that for large $n$ may be represented
as a sum of mutually orthogonal one-dimensional projectors that all violate the
lower bounds in (9.10) and (9.11) for some $\delta > 0$, has an asymptotically vanishing
expectation value with respect to $\omega$.


As for the proof of Theorem 9.2.2, we first prove lower and then upper
bounds [55].


9.2.1.1 Lower Bounds 

In the classical case, it has been shown that there cannot be more than $2^{c+1} - 1$
different programs of length $\ell \le c$ and this fact has been used to prove the lower
bound to complexity in Brudno’s Theorem.

A similar result holds for QTMs, too. In order to show this, one can adapt an
argument due to [73] which states that there cannot be more than $2^{\ell+1} - 1$ mutually
orthogonal one-dimensional projectors $p$ with quantum complexity $QC_q(p) \le \ell$.
The proof is based on Holevo’s $\chi$-quantity (see Proposition 6.3.3); we shall
use it to provide an explicit upper bound on the maximal number of orthogonal
one-dimensional projectors that can be approximated within trace-distance $\delta$ by the
action of completely positive maps $\mathcal{E}$ on density matrices $\sigma$ of length $\ell(\sigma) \le c$.


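Lemma 9.2.1 (Quantum Counting Argument) Let $0 < \delta < 1/e$, $c \in \mathbb{N}$ such that
$c \ge \frac{1}{\delta}\big(4 + 2 \log_2 \frac{1}{\delta}\big)$, $\mathcal{K}$ a linear subspace of an arbitrary Hilbert space $\mathbb{K}$, and
$\mathcal{E} : \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathbb{K})$ a quantum operation. Let $N_c^{\delta}$ be a maximum cardinality subset
of orthonormal vectors from the set $A_c^{\delta}(\mathcal{E}, \mathcal{K})$ of all normalized vectors in $\mathcal{K}$ which
are reproduced within $\delta$ by the operation $\mathcal{E}$ on some input of length $\le c$:

\[ A_c^{\delta}(\mathcal{E}, \mathcal{K}) := \big\{ |\phi\rangle \in \mathcal{K} \,:\, \exists\, \sigma \in \mathcal{B}_1^+(\mathcal{H}_{\le c}),\ D\big(\mathcal{E}[\sigma], |\phi\rangle\langle\phi|\big) \le \delta \big\} . \]

Then, $\log_2 |N_c^{\delta}| < c + 1 + \frac{2 + \delta}{1 - 2\delta}\, \delta c$.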


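Proof Let $\phi_j \in A_c^{\delta}(\mathcal{E}, \mathcal{K})$, $j = 1, \ldots, N$, be a set of orthonormal vectors and let $\mathcal{V}$ denote
the Abelian subalgebra of $B(\mathbb{K})$ generated by the corresponding projectors $P_j :=
|\phi_j\rangle\langle\phi_j|$ and $P_{N+1} := \mathbb{1}_{\mathbb{K}} - \sum_{j=1}^{N} P_j$. By the definition of $A_c^{\delta}(\mathcal{E}, \mathcal{K})$, for every $1 \le
i \le N$, there are density matrices $\sigma_i$ acting on $\mathcal{H}_{\le c}$, with $D(\mathcal{E}[\sigma_i], P_i) \le \delta$.

Let $\sigma := \frac{1}{N} \sum_{i=1}^{N} \sigma_i$; it also acts on $\mathcal{H}_{\le c}$ and $\dim \mathcal{H}_{\le c} = 2^{c+1} - 1$, whence (6.45)
yields $\chi(E_\sigma) < c + 1$, where $E_\sigma := \{\sigma, \sigma_i/N\}$. Then, consider the completely positive
map $\mathcal{E}_{\mathcal{V}} : \mathcal{B}_1^+(\mathbb{K}) \to \mathcal{B}_1^+(\mathbb{K})$, $\rho \mapsto \mathcal{E}_{\mathcal{V}}[\rho] := \sum_{i=1}^{N+1} P_i \rho P_i$. Applying twice the
monotonicity of the relative entropy under completely positive maps,

\[ \frac{1}{N} \sum_{i=1}^{N} S\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i],\, \mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma]\big) \le \frac{1}{N} \sum_{i=1}^{N} S\big(\mathcal{E}[\sigma_i],\, \mathcal{E}[\sigma]\big) \le \chi(E_\sigma) . \]

For every $i \in \{1, \ldots, N\}$, the density matrix $\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i]$ is close to the corresponding
one-dimensional projector $\mathcal{E}_{\mathcal{V}}[P_i] = P_i$. Indeed, (6.80) yields

\[ D\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i],\, \mathcal{E}_{\mathcal{V}}[P_i]\big) \le D\big(\mathcal{E}[\sigma_i], P_i\big) \le \delta . \]

Let $\Delta := \frac{1}{N} \sum_{i=1}^{N} P_i$. The trace-distance is jointly convex (see (6.81)), thus

\[ D\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma],\, \Delta\big) \le \frac{1}{N} \sum_{i=1}^{N} D\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i],\, P_i\big) \le \delta . \]

Since $\delta < \frac{1}{e}$, Fannes inequality (5.167) gives

\[ S\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i]\big) = \big| S\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma_i]\big) - S(P_i) \big| \le \delta \log_2 (N + 1) + \eta(\delta) , \]
\[ \big| S\big(\mathcal{E}_{\mathcal{V}} \circ \mathcal{E}[\sigma]\big) - S(\Delta) \big| \le \delta \log_2 (N + 1) + \eta(\delta) , \]

where $\eta(\delta) := -\delta \log_2 \delta$. Combining the previous estimates yields

\[ c + 1 > \chi(E_\sigma) \ge (1 - 2\delta) \log_2 N - 2\delta - 2\eta(\delta) . \]

If $\log_2 N \ge c + 1 + \frac{2+\delta}{1-2\delta}\, \delta c$, then $c + 1 > c + 1 + \delta(c\delta - 4) + 2\delta \log_2 \delta$, whence
$c < \frac{2}{\delta}\big(2 + \log_2 \frac{1}{\delta}\big)$, contradicting the assumption on $c$. Therefore, the maximum number $|N_c^{\delta}|$ of orthonormal vectors in
$A_c^{\delta}(\mathcal{E}, \mathcal{K})$ must fulfil $\log_2 |N_c^{\delta}| < c + 1 + \frac{2+\delta}{1-2\delta}\, \delta c$.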
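To get a feeling for the numbers in Lemma 9.2.1, the following Python sketch evaluates the admissible input length, the counting bound and the quantity $\delta(c\delta - 4) + 2\delta \log_2 \delta$ whose sign decides the last step of the proof. The function names are hypothetical and logarithms are taken in base 2, as in the statement of the lemma.

```python
import math

def threshold(delta):
    """Smallest integer c with c >= (1/delta) * (4 + 2*log2(1/delta))."""
    return math.ceil((4 + 2 * math.log2(1 / delta)) / delta)

def log2_count_bound(c, delta):
    """Right-hand side of the lemma: c + 1 + (2 + delta)/(1 - 2*delta) * delta * c."""
    return c + 1 + (2 + delta) / (1 - 2 * delta) * delta * c

def slack(c, delta):
    """delta*(c*delta - 4) + 2*delta*log2(delta).

    The proof by contradiction needs this to be non-negative, which holds
    exactly when c is at least the threshold above.
    """
    return delta * (c * delta - 4) + 2 * delta * math.log2(delta)
```

For instance, with $\delta = 0.1$ the hypothesis of the lemma requires $c \ge 107$, and for $c = 107$ the number of orthonormal vectors that can be reproduced is below $2^{136.1}$, to be compared with the $2^{108} - 1$ dimensions available to inputs of length at most $c$.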


The second step uses the previous lemma together with Proposition 7.6.1 about
the minimum dimension of the typical subspaces. Notice that the limit (7.153) is
valid for all $0 < \varepsilon < 1$. By means of this property, one proves the lower bound for
the finite-accuracy complexity $QC^{\delta}_q(p)$, and then uses Example 9.2.1.2 to extend it
to $QC_q(p)$.

Corollary 9.2.1 (Lower Bound for $\frac{1}{n} QC^{\delta}_q(p)$) Let $(\mathcal{A}_{\mathbb{Z}}, \omega)$ be an ergodic quantum
source with entropy rate $s$. Further, let $0 < \delta < 1/e$, and let $(p_n)_{n \in \mathbb{N}}$ be a sequence
of typical projectors, according to Definition 7.6.2. Then, there is another sequence
of typical projectors $\tilde{p}_n(\delta) \le p_n$, such that for $n$ large enough

\[ \frac{1}{n}\, QC^{\delta}_q(p) > s - \delta(2 + \delta)s \]

is true for every one-dimensional projector $p \le \tilde{p}_n(\delta)$.


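Proof The case $s = 0$ is trivial, so let $s > 0$. Fix $n \in \mathbb{N}$, $0 < \delta < 1/e$ and consider
the set

\[ \tilde{A}_n(\delta) := \big\{ p \le p_n \,:\, p = |\psi\rangle\langle\psi| ,\ QC^{\delta}_q(p) \le ns\big(1 - \delta(2 + \delta)\big) \big\} . \]

From the definition of $QC^{\delta}_q(p)$, for any such $p$ there exists a density matrix $\sigma_p$
with $\ell(\sigma_p) \le ns(1 - \delta(2 + \delta))$ such that $D(\mathfrak{U}(\sigma_p), p) < \delta$, where, as explained in
Lemma 9.1.1, $\mathfrak{U}(\sigma_p)$ is the result of the quantum operation $\mathfrak{U} : \mathcal{B}_1^+(\mathcal{H}_F) \to \mathcal{B}_1^+(\mathcal{H}_F)$
associated with the UQTM $\mathfrak{U}$ that has been fixed as explained before Theorem 9.1.1.
Then, using the notation of Lemma 9.2.1, $\tilde{A}_n(\delta) \subset A^{\delta}_{\lceil ns(1-\delta(2+\delta)) \rceil}(\mathfrak{U}, \mathcal{K}_n)$, where
$\mathcal{K}_n$ is the typical subspace supporting $p_n$. Let $p_n(\delta) \le p_n$ be the sum of any maximal
number of mutually orthogonal projectors from $A^{\delta}_{\lceil ns(1-\delta(2+\delta)) \rceil}(\mathfrak{U}, \mathcal{K}_n)$. If $n$ is such
that

\[ \big\lceil ns\big(1 - \delta(2 + \delta)\big) \big\rceil \ge \frac{1}{\delta}\Big(4 + 2 \log_2 \frac{1}{\delta}\Big) , \]

Lemma 9.2.1 implies that

\[ \log_2 \mathrm{Tr}\, p_n(\delta) < \big\lceil ns(1 - \delta(2+\delta)) \big\rceil + 1 + \frac{2+\delta}{1-2\delta}\, \delta\, \big\lceil ns(1 - \delta(2 + \delta)) \big\rceil . \tag{9.12} \]

Therefore, no one-dimensional projectors $p \le p_n(\delta)^{\perp} := p_n - p_n(\delta)$ exist such that
$p \in A^{\delta}_{\lceil ns(1-\delta(2+\delta)) \rceil}(\mathfrak{U}, \mathcal{K}_n)$. Namely, one-dimensional projectors $p \le p_n(\delta)^{\perp}$ must
satisfy

\[ \frac{1}{n}\, QC^{\delta}_q(p) > s - \delta(2 + \delta)s . \]

Since inequality (9.12) is valid for every $n \in \mathbb{N}$ large enough,

\[ \limsup_{n \to \infty} \frac{1}{n} \log_2 \mathrm{Tr}\, p_n(\delta) \le s - \frac{\delta^3 (2 + \delta)}{1 - 2\delta}\, s < s . \tag{9.13} \]

From Proposition 7.6.1, $\lim_{n\to\infty} \mathrm{Tr}(\rho^{(n)} p_n(\delta)) = 0$, whence $\tilde{p}_n(\delta) := p_n(\delta)^{\perp}$
provide the required sequence of typical projectors.

Corollary 9.2.2 (Lower Bound for $\frac{1}{n} QC_q(p)$) Let $(\mathcal{A}_{\mathbb{Z}}, \omega)$ be an ergodic quantum
source with entropy rate $s$. Let $(p_n)_{n \in \mathbb{N}}$ with $p_n \in \mathcal{A}^{(n)}$ be an arbitrary sequence
of typical projectors. Then, for every $0 < \delta < 1/e$, there is a sequence of typical
projectors $\tilde{p}_n(\delta) \le p_n$ such that, for $n$ large enough, $\frac{1}{n} QC_q(p) > s - \delta$ is satisfied
for every one-dimensional projector $p \le \tilde{p}_n(\delta)$.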


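Proof From Corollary 9.2.1, for every $k \in \mathbb{N}$, there exists a sequence of typical
projectors $\tilde{p}_n(\frac{1}{k}) \le p_n$, such that, if $n$ is large enough,

\[ \frac{1}{n}\, QC^{1/k}_q(p) > s - \frac{1}{k}\Big(2 + \frac{1}{k}\Big) s \]

for every one-dimensional projector $p \le \tilde{p}_n(1/k)$. Then

\[ \frac{1}{n}\, QC_q(p) \ge \frac{1}{n}\, QC^{1/k}_q(p) - \frac{2 + 2\lfloor \log_2 k \rfloor}{n} > s - \frac{1}{k}\Big(2 + \frac{1}{k}\Big) s - \frac{2(2 + \log_2 k)}{n} , \]

where the first estimate is by Example 9.2.1.2 and the second one is true for
one-dimensional projectors $p \le \tilde{p}_n(\frac{1}{k})$ and $n \in \mathbb{N}$ large enough. Fix a large $k$ satisfying
$\frac{1}{k}\big(2 + \frac{1}{k}\big) s \le \frac{\delta}{2}$. The result follows by setting $\tilde{p}_n(\delta) := \tilde{p}_n(\frac{1}{k})$ with $k$, and $n$ such
that

\[ \frac{1}{k}\Big(2 + \frac{1}{k}\Big) s + \frac{2(2 + \log_2 k)}{n} \le \frac{\delta}{2} + \frac{2(2 + \log_2 k)}{n} \le \delta . \]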


9.2.1.2 Upper Bounds 

The lower bound shows that, for large n, with high probability the qubit complexity 
of pure states of a quantum spin chain is bounded from below by a quantity which 
is close to the entropy rate of the chain. Similar upper bounds also hold from which 
Theorem 9.2.2 follows. 


Proposition 9.2.1 (Upper Bound) Let $(\mathcal{A}_{\mathbb{Z}}, \omega)$ be an ergodic quantum source with
entropy rate $s$. Then, for every $0 < \delta < 1/e$, there is a sequence of typical projectors
$p_n(\delta) \in \mathcal{A}^{(n)}$ such that for every one-dimensional projector $p \le p_n(\delta)$ and $n$ large
enough

\[ \frac{1}{n}\, QC_q(p) < s + \delta \qquad \text{and} \qquad \frac{1}{n}\, QC^{\delta}_q(p) < s + \delta . \]

The proof of this statement is obtained by explicitly providing, for any minimal
projector $p \le p_n(\delta) \in \mathcal{A}^{(n)}$, a qubit string $\tilde{p}$ of length $\ell(\tilde{p}) \simeq n(s + \delta)$ that
computes $p$ with arbitrary accuracy. Such a qubit string is constructed by means of
the universal quantum typical subspaces (see Definition 7.6.3); its length is
in general not minimal and only upper-bounds the quantum complexities $QC^{\delta}_q(p)$
and $QC_q(p)$. However, it is shorter than the literal transcription of $p$ (see
Example 9.2.1.1): recall that the latter corresponds to a qubit string comprising $p$ itself
plus the instructions to the UQTM $\mathfrak{U}$ to copy $p$, whence its length is $\simeq n >
n(s + \delta)$ for $n$ large enough and $\delta$ sufficiently small.

Let $0 < \varepsilon < \delta/2$ be an arbitrary real number such that $r := s + \varepsilon$ is rational, and
let $(p_n)_{n \in \mathbb{N}}$ be the universal projector sequence of Theorem 7.6.3, which is
independent of the given state $\omega$ as long as $s(\omega) \le s$.

Though the dimension of the subspace supporting $p_n$ is $2^{nr}$, generic one-dimensional
projections $q \le p_n$ are not qubit strings as defined in Sect. 9.1.1 and their
lengths need not be $\ell(q) \simeq nr$. However, because of (7.154), if $n$ is large enough
then there exists some unitary transformation $U$ that transforms the projector $p_n$
into a projector belonging to the state-space $\mathcal{B}_1^+(\mathcal{H}_{\lceil nr \rceil})$, where $\lceil nr \rceil$ is the smallest
integer larger than $nr$. It follows that every one-dimensional projector $p \le p_n$ can
be transformed into a qubit string $\tilde{p} := U^{\dagger} p\, U$ of length $\ell(\tilde{p}) = \lceil nr \rceil$.

According to Remark 9.1.3, $\tilde{p}$ can be presented to the UQTM $\mathfrak{U}$ together with some
classical instructions including a subprogram for the computation of the necessary
unitary rotation $U$. This UQTM starts by computing a classical description of the
transformation $U$, and subsequently applies $U$ to $\tilde{p}$, recovering the original projector
$p = U \tilde{p}\, U^{\dagger}$ on the output tape.

Apart from technical details, the main point in the proof is the following: since
the unitary operator $U$ depends on $\omega$ only through the entropy rate $s$, the subprogram
that computes $U$ does not have to be supplied with additional information on $\omega$ and
its restriction to $\mathcal{A}^{(n)}$. Therefore, the additional instructions for the implementation of
$U$ will contribute a number of extra qubits which is independent of the universal
projection index $n$.

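The quantum decompression algorithm $\mathfrak{D}$ will formally amount to a mapping ($r$
is rational)

\[ \mathfrak{D} : \mathbb{N} \times \mathbb{N} \times \mathbb{Q} \times \mathcal{H}_F \to \mathcal{H}_F , \qquad (k, n, r, \tilde{p}) \mapsto p = \mathfrak{D}(k, n, r, \tilde{p}) . \]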


Remark 9.2.3 The decompression algorithm $\mathfrak{D}$ is meant to be short in the sense of
being “short in description”, not short (fast) in running time or resource consumption.
Indeed, the algorithm is in general slow and memory consuming; however, this
does not matter. In fact, algorithmic complexity only cares about the length of the
programs, not about how fast they are computed or how many resources
they consume.


In the following steps, $\mathfrak{D}$ will deal with rational numbers, square roots of rational
numbers, bit-approximations (up to some specified accuracy) of real numbers and
vectors and matrices containing such numbers. Classical TMs as well as QTMs can of
course deal with all such objects. For example, rational numbers can be stored as lists
of two integers (containing numerator and denominator), square roots can be stored
as such lists supplemented with an additional bit to denote the square root operation,
and, also, binary-digit-approximations can be stored as binary strings. Vectors and
matrices are arrays containing those objects. They will be presented to the UQTM
$\mathfrak{U}$ as vectors of the computational basis, and operations on them, like addition or
multiplication, can be implemented as easily as on classical computers.
The instructions defining the quantum algorithm $\mathfrak{D}$ are as follows.


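1. Read $n$, $r$; find $\ell \in \mathbb{N}$ such that $\ell \cdot 2^{3\ell} \le n < 2 \cdot \ell \cdot 2^{3 \cdot 2\ell}$ with $\ell$ a power of two
(there is only one such $\ell$). Compute $\tilde{n} := \lfloor n/\ell \rfloor$. Compute $R := r\ell$.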

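2. Compute a list of codewords $\Omega^{(\tilde{n})}_{\ell,R}$ belonging to a classical universal block code
sequence of rate $R$. The construction of an appropriate algorithm can be found
for instance in [201].
Since $\Omega^{(\tilde{n})}_{\ell,R} \subset (\{0, 1\}^{\ell})^{\tilde{n}}$, $\Omega^{(\tilde{n})}_{\ell,R} = \{\omega_1, \omega_2, \ldots, \omega_M\}$ can be stored as a list of
binary strings. Every string has length $\ell(\omega_i) = \tilde{n}\ell$ and the exact value of the
cardinality $M \simeq 2^{\tilde{n}R}$ depends on the choice of $\Omega^{(\tilde{n})}_{\ell,R}$.

3. Compute a basis $\{A_{\{i_1, \ldots, i_{\tilde{n}}\}}\}$ of the symmetric subspace

\[ \mathrm{SYM}^{\tilde{n}}(\mathcal{A}^{(\ell)}) := \mathrm{span}\{ A^{\otimes \tilde{n}} \,:\, A \in \mathcal{A}^{(\ell)} \} . \]

Namely, for every $\tilde{n}$-tuple $\{i_1, \ldots, i_{\tilde{n}}\}$, where $i_k \in \{1, \ldots, 2^{2\ell}\}$, there is one basis
element $A_{\{i_1, \ldots, i_{\tilde{n}}\}} \in \mathcal{A}^{(\tilde{n}\ell)}$, given by

\[ A_{\{i_1, \ldots, i_{\tilde{n}}\}} = \sum_{\sigma} e^{(\ell, \tilde{n})}_{\sigma(i_1, \ldots, i_{\tilde{n}})} , \tag{9.14} \]

where the summation runs over all $\tilde{n}$-permutations $\sigma$, and

\[ e^{(\ell, \tilde{n})}_{i_1, \ldots, i_{\tilde{n}}} := e^{(\ell)}_{i_1} \otimes e^{(\ell)}_{i_2} \otimes \cdots \otimes e^{(\ell)}_{i_{\tilde{n}}} , \]

with $\{e^{(\ell)}_k\}_{k=1}^{2^{2\ell}}$ a system of matrix units in $\mathcal{A}^{(\ell)}$. In the computational basis,
all entries of such matrices are zero, except for one entry which is one. There
is a number of $d = \binom{\tilde{n} + 2^{2\ell} - 1}{2^{2\ell} - 1} = \dim(\mathrm{SYM}^{\tilde{n}}(\mathcal{A}^{(\ell)}))$ different matrices $A_{\{i_1, \ldots, i_{\tilde{n}}\}}$
which we can label by $\{A_k\}_{k=1}^{d}$. It follows from (9.14) that these matrices have
integer entries and can thus be stored as lists of $2^{\tilde{n}\ell} \times 2^{\tilde{n}\ell}$ tables of integers
without any need of approximations.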

4. For every $i \in \{1, \ldots, M\}$ and $k \in \{1, \ldots, d\}$, let $|u_{k,i}\rangle := A_k |\omega_i\rangle$, where $|\omega_i\rangle$
denotes the computational basis vector which is a tensor product of $|0\rangle$’s and
$|1\rangle$’s according to the bits of the string $\omega_i$. Compute the vectors $|u_{k,i}\rangle$ one after
the other. For every vector that has been computed, check if it can be written as
a linear combination of already computed vectors.
(The corresponding system of linear equations can be solved exactly, since every
vector is given as an array of integers.) If yes, then discard the new vector $|u_{k,i}\rangle$,
otherwise store it and give it a number. This way, a set of vectors $\{|u_k\rangle\}_{k=1}^{D}$ is
computed. These vectors linearly span the support of the projector given
in (7.157).

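5. Denote by $\{|\phi_i\rangle\}_{i=1}^{2^{n - \tilde{n}\ell}}$ the computational basis vectors of $\mathcal{H}_{n - \tilde{n}\ell}$. If $n = \ell \cdot \tilde{n}$,
then let $\tilde{D} := D$, and let $|x_k\rangle := |u_k\rangle$. Otherwise, compute $|u_k\rangle \otimes |\phi_i\rangle$ for every
$k \in \{1, \ldots, D\}$ and $i \in \{1, \ldots, 2^{n - \tilde{n}\ell}\}$. The resulting set of vectors $\{|x_k\rangle\}_{k=1}^{\tilde{D}}$
has cardinality $\tilde{D} := D \cdot 2^{n - \tilde{n}\ell}$. In both cases, the resulting vectors $|x_k\rangle \in \mathcal{H}_n$
span the support of the projector $p_n$.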


6. The set $\{|x_k\rangle\}_{k=1}^{\tilde{D}}$ is completed to linearly span the whole space $\mathcal{H}_n$. This will be
accomplished as follows. Consider the sequence of vectors

\[ \{|\tilde{x}_j\rangle\}_{j=1}^{\tilde{D} + 2^n} := \{|x_j\rangle\}_{j=1}^{\tilde{D}} \cup \{|\Phi_j\rangle\}_{j=1}^{2^n} , \]

where $\{|\Phi_j\rangle\}_{j=1}^{2^n}$ denotes the computational basis vectors of $\mathcal{H}_n$. Find the smallest
$i$ such that $|\tilde{x}_i\rangle$ can be written as a linear combination of $\{|\tilde{x}_j\rangle\}_{j=1}^{i-1}$, and discard
it (this can still be decided exactly, since all the vectors are given as tables of
integers). Repeat this step $\tilde{D}$ times until there remain only $2^n$ linearly independent
vectors, namely all the $|x_j\rangle$ and $2^n - \tilde{D}$ of the $|\Phi_j\rangle$.


7. Finally, apply the Gram-Schmidt orthonormalization procedure to the resulting
vectors, to get an orthonormal basis $\{|y_k\rangle\}_{k=1}^{2^n}$ of $\mathcal{H}_n$, such that the first $\tilde{D}$ vectors
are a basis for the support of $p_n$. Since every vector $|x_j\rangle$ and $|\Phi_j\rangle$ has
only integer entries, all the resulting vectors $|y_k\rangle$ will have only entries that are
(plus or minus) the square root of some rational number.
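The exactness claimed in steps 4 and 6 rests on the fact that linear dependence of integer vectors can be decided without rounding. A minimal Python sketch of the completion in step 6, using exact rational arithmetic, could look as follows; the function names are hypothetical, the input vectors are assumed to be linearly independent, and the greedy scan shown here is just one straightforward way of discarding dependent vectors, not necessarily the procedure of [55].

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of integer vectors, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def complete_to_basis(xs, dim):
    """Append computational basis vectors to xs and keep only the independent ones."""
    kept = []
    candidates = list(xs) + [[int(i == j) for j in range(dim)] for i in range(dim)]
    for v in candidates:
        if rank(kept + [v]) > len(kept):   # v is not a combination of the vectors kept so far
            kept.append(v)
    return kept
```

Applied to `[[1, 1, 0], [0, 1, 1]]` in dimension 3, the scan keeps both given vectors and adds only the first computational basis vector, since the remaining two are linear combinations of what has already been kept.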


Up to this point, the previous steps did not involve any kind of numerical approximation.
Instead, the next ones will compute an approximate description of the
desired unitary decompression map $U$ and apply it to the quantum state $\tilde{p}$. In view
of Remark 9.1.3, the task is to calculate the number $N$ of bits necessary to guarantee
that the output will be within trace-distance $\delta = 1/k$ of $p$.


8. Read the value of $k$ (which denotes an approximation parameter; the larger $k$,
the more accurate the output of the algorithm will be). Due to the considerations
above and the calculations below, the necessary number of bits $N$ turns out to
be $N = 1 + \big\lceil \log_2 \big( 2k\, 2^n (10\sqrt{2^n})^{2^n} \big) \big\rceil$. Compute this number. Then, compute the
components of all the vectors $\{|y_k\rangle\}_{k=1}^{2^n}$ up to $N$ bits of accuracy. (This involves
only calculation of the square root of rational numbers, which can be done to
any desired accuracy.) Denote the resulting numerically approximated vectors by
$|\tilde{y}_k\rangle$ and write them as columns into an array (a matrix) $\tilde{U} := (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_{2^n})$.
Let $U := (y_1, y_2, \ldots, y_{2^n})$ denote the unitary matrix with the exact vectors $|y_k\rangle$
as columns. Since $N$ binary digits give an accuracy of $2^{-N}$, it follows that

\[ \big| \tilde{U}_{ij} - U_{ij} \big| < 2^{-N} < \frac{1/k}{2 \cdot 2^n (10\sqrt{2^n})^{2^n}} . \]

If two $2^n \times 2^n$-matrices $U$ and $\tilde{U}$ are $\varepsilon$-close in their entries, they must be
$2^n \cdot \varepsilon$-close in norm, too. Whence we get

\[ \| \tilde{U} - U \| < \frac{1/k}{2\, (10\sqrt{2^n})^{2^n}} . \]
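The number of bits in step 8 grows very quickly with $n$. The following Python sketch evaluates $N$ through logarithms and checks the entrywise accuracy condition exactly on the squared quantities, which are rational; the function names are hypothetical and the use of floating-point logarithms is an assumption that is adequate only for small $n$.

```python
import math
from fractions import Fraction

def required_bits(k, n):
    """N = 1 + ceil(log2(2k * 2^n * (10*sqrt(2^n))^(2^n))), evaluated via logarithms."""
    d = 2 ** n
    log_x = math.log2(2 * k) + n + d * (math.log2(10) + n / 2)
    return 1 + math.ceil(log_x)

def entrywise_budget_squared(k, n):
    """Square of (1/k) / (2 * 2^n * (10*sqrt(2^n))^(2^n)), as an exact rational."""
    d = 2 ** n
    return Fraction(1, (2 * k * d) ** 2 * 100 ** d * d ** d)
```

For a single qubit and $k = 1$ this gives $N = 11$, while two qubits with $k = 4$ already require $N = 24$; in both cases $2^{-N}$ lies below the entrywise bound of step 8, as can be verified by comparing $4^{-N}$ with the value returned by `entrywise_budget_squared`.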


So far, every step could have been performed on a classical computer; the intrinsically
quantum part starts when one considers the qubit string $\tilde{p}$, that is the input quantum
program.


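9. Compute $\lceil nr \rceil$, which gives the length $\ell(\tilde{p})$. Afterwards, move $\tilde{p}$ to some free
space on the input tape, and append zeroes, i.e. create the state

\[ p' \equiv |\psi_0\rangle\langle\psi_0| := \big( |0\rangle\langle 0| \big)^{\otimes (n - \lceil nr \rceil)} \otimes \tilde{p} \]

on some segment of $n$ cells on the input tape.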

10. According to Remark 9.1.3, apply a unitary approximation to the unitary trans- 
formation U on the tape segment that contains the state p’, move the result onto 
the output tape and halt. 


Proof of Proposition 9.2.1 The triple $(n, r, \tilde{p})$ can be encoded into a single qubit
string $\sigma$ (note that the parameter $k$ is not a part of $\sigma$) as follows. First, write both
$r$ and $n$ in a self-delimiting way as computational basis vectors $|\beta(r)\rangle$, respectively
$|\beta(n)\rangle$ (see Definition 9.2.1), of length $C(r)$, depending only on $r$, respectively $2\lfloor \log_2 n \rfloor + 2$.

Then, consider the projectors $P_n := |\beta(n)\rangle\langle\beta(n)|$, $P_r := |\beta(r)\rangle\langle\beta(r)|$ and attach
to them the rotated projector $\tilde{p} = U^{\dagger} p\, U$, so that the resulting input qubit string is
$\sigma(p) := P_r \otimes P_n \otimes \tilde{p}$. If $n$ fulfils (7.162), then

\[ \ell(\sigma(p)) = 2\lfloor \log_2 n \rfloor + 2 + C(r) + \lceil nr \rceil , \]

where $C(r) \in \mathbb{N}$ is some constant which depends on $r$, but not on $n$.

This qubit string is presented to the UQTM $\mathfrak{U}$ together with a description of the
decompression algorithm $\mathfrak{D}$ of fixed length $C'(r)$ which depends on $r$, but not on $n$.
This will give a qubit string $\sigma_{\mathfrak{U}}(p)$ of length

\[ \ell(\sigma_{\mathfrak{U}}(p)) = 2\lfloor \log_2 n \rfloor + 2 + C(r) + \lceil nr \rceil + C'(r) \le 2 \log_2 n + n\Big(s + \frac{1}{2}\delta\Big) + C''(r) , \]

where $C''(r)$ is again a constant which depends on $r$, but not on $n$. The matrix
$U$, whose construction is part of the decompression algorithm $\mathfrak{D}$, rotates (decompresses)
a compressed (short) qubit string $\tilde{p}$ back into the typical subspace. Conversely,
for every one-dimensional projector $p \le p_n$, where $p_n$ was defined
in (7.161), let $\tilde{p} \in \mathcal{B}_1^+(\mathcal{H}_{\lceil nr \rceil})$ be the projector given by $(|0\rangle\langle 0|)^{\otimes (n - \lceil nr \rceil)} \otimes \tilde{p} = U^{\dagger} p\, U$.
Then, since $\mathfrak{D}$ is such that the trace-distance fulfils $D\big(\mathfrak{U}(k, \sigma_{\mathfrak{U}}(p)), p\big) < \frac{1}{k}$ for every
$k \in \mathbb{N}$, it follows that

\[ \frac{1}{n}\, QC_q(p) \le \frac{2 \log_2 n}{n} + s + \frac{1}{2}\delta + \frac{C''(r)}{n} . \]

If $n$ is large enough, then the first inequality in Proposition 9.2.1 follows, while the
second inequality is proved by letting $k := \lceil \frac{1}{2\delta} \rceil$. Then, for every one-dimensional
projector $p \le p_n$ and $n$ large enough

\[ \frac{1}{n}\, QC^{2\delta}_q(p) \le \frac{1}{n}\, QC^{1/k}_q(p) \le \frac{1}{n}\, QC_q(p) + \frac{2\lfloor \log_2 k \rfloor + 2}{n} < s + \delta + \frac{2 \log_2 k + 2}{n} < s + 2\delta , \tag{9.15} \]

where the first inequality follows from the obvious monotonicity property $\delta \ge \varepsilon \Rightarrow
QC^{\delta}_q(p) \le QC^{\varepsilon}_q(p)$, the second one is by Example 9.2.1.2 and the third estimate is
due to the first inequality in Proposition 9.2.1.


Proof of Theorem 9.2.2 Let $p_n(\delta)$ be the typical projector sequence given in
Proposition 9.2.1, i.e. the complexity rates $\frac{1}{n} QC_q(p)$ and $\frac{1}{n} QC^{\delta}_q(p)$ of every one-dimensional
projector $p \le p_n(\delta)$ are upper-bounded by $s + \delta$.

Due to Corollary 9.2.1, there exists another sequence of typical projectors $\pi_n(\delta) \le
p_n(\delta)$ such that, additionally, $\frac{1}{n} QC^{\delta}_q(\pi) > s - \delta(2 + \delta)s$ is satisfied for all
one-dimensional projections $\pi \le \pi_n(\delta)$.

Also, from Corollary 9.2.2, there is another sequence of typical projectors $\tilde{\pi}_n(\delta) \le
\pi_n(\delta)$ such that $\frac{1}{n} QC_q(\pi) > s - \delta$ holds for all one-dimensional projections $\pi \le
\tilde{\pi}_n(\delta)$.

Further, the optimality of these upper and lower bounds, and thus of $s$ as optimal
expected asymptotic complexity rate, follows from applying Lemma 9.2.1 together
with Proposition 7.6.1.


Remark 9.2.4 Unlike in Theorem 4.2.1, where the result holds almost everywhere,
its quantum generalization given above essentially holds in probability. The major
obstruction to a stronger quantum version comes from the difficulty of extending to
qubit strings what is natural for bit strings, namely their concatenation [76].


Example 9.2.2 Consider a quantum spin chain $(\mathcal{A}, \omega)$ of Bernoulli type with a state
$\omega$ which is the tensor product of tracial states $\rho = \mathbb{1}_2/2$ for each qubit; this quantum
source is mixing, thus ergodic, and its entropy rate is $s = -\mathrm{Tr}\, \rho \log_2 \rho = 1$. Then, the
quantum version of Brudno’s theorem states that there exists a sequence of subspaces
$\mathcal{K}_n \subseteq \mathcal{H}_F$ of high probability, such that for any $\varepsilon > 0$, by taking $n$ sufficiently large,

\[ 1 - \varepsilon < \frac{1}{n}\, QC_q(\psi) < 1 + \varepsilon , \]

for all qubit pure states $\psi \in \mathcal{K}_n$.


9.3 bit Quantum Complexity 


A different approach to quantum algorithmic complexity is proposed in [365] where,
as effective descriptions of $n$-qubit strings $|\psi\rangle \in \mathcal{H}_n$, one chooses bit strings
corresponding to self-delimiting classical programs $p \in \Omega_2^*$ instead of generic qubit
strings. These classical programs are presented to a fixed UQTM $\mathfrak{U}$ as computational
basis vectors $|p\rangle$ which, after being processed by $\mathfrak{U}$, are output as normalized vectors
$|\mathfrak{U}(p)\rangle \in \mathcal{H}_n$. Furthermore, the difference between the output $|\mathfrak{U}(p)\rangle$ and the target
$|\psi\rangle$ is taken care of by the scalar product $\langle \psi | \mathfrak{U}[p] \rangle$.


Definition 9.3.1 (bit Quantum Complexity) The bit quantum complexity $QC_c(\psi)$
of $n$-qubit vector states $|\psi\rangle \in \mathcal{H}_n$ is

\[ QC_c(\psi) := \min\Big\{ \ell(p) + \big\lceil -\log_2 |\langle \psi | \mathfrak{U}[p] \rangle|^2 \big\rceil \Big\} , \]

where $p \in \Omega_2^*$ is any self-delimiting binary program.

The logarithmic correction acts as a penalty for bad approximations:
$-\log_2 |\langle \psi | \mathfrak{U}[p] \rangle|$ diverges for an effective description of $\psi$ which yields a vector
orthogonal to it, while it vanishes when $|\mathfrak{U}(p)\rangle \simeq |\psi\rangle$. Therefore, the bit quantum
complexity results from a tradeoff between the length of the classical description
and the permitted errors.


Example 9.3.1 Let a vector $\psi \in \mathcal{H}_n$ be called directly computable if there exists
a self-delimiting program $p \in \Omega_2^*$ such that $|\mathfrak{U}(p)\rangle = |\psi\rangle$. Then, consider an
orthonormal basis $\mathfrak{B} := \{|b_i\rangle\}_{i=1}^{2^n}$ in $\mathcal{H}_n$ entirely consisting of directly computable
vectors. Let $K(\mathfrak{B})$ denote its classical prefix-complexity achieved by a self-delimiting
program $q_{\mathfrak{B}}$, $K(\mathfrak{B}) = \ell(q_{\mathfrak{B}})$. Let us fix $|\psi\rangle = |b_i\rangle \in \mathfrak{B}$; if $p_i$ is any program such that
$|\mathfrak{U}(p_i)\rangle = |b_i\rangle$ then no penalty for a bad approximation is to be paid and, with
$p_*$ the shortest among such programs,

\[ QC_c(\psi) \le \ell(p_*) . \tag{$*$} \]

On the other hand, let $QC_c(\psi)$ be attained at $q_* \in \Omega_2^*$, namely

\[ QC_c(\psi) = \ell(q_*) + \big\lceil -\log_2 |\langle \mathfrak{U}(q_*) | \psi \rangle|^2 \big\rceil . \]

By letting $\mathfrak{U}$ process the binary programs in dovetailed fashion (see Remark 4.1.5), $q_*$
can be used to construct the vector $|\mathfrak{U}(q_*)\rangle \in \mathcal{H}_n$ whose coefficients $\langle b_j | \mathfrak{U}(q_*) \rangle$
in the expansion with respect to the ONB $\mathfrak{B}$ provide probabilities $|\langle b_j | \mathfrak{U}(q_*) \rangle|^2$
that can be used to construct a Shannon-Fano-Elias code-word $q(i)$ for $|b_i\rangle$ (see
Example 3.2.3). Therefore, $q_{\mathfrak{B}}$, $q_*$ and $q(i)$ can be used to construct a self-delimiting
program $q = q_{\mathfrak{B}}\, q_*\, q(i)$ such that $\mathfrak{U}$ does the following:

• it constructs the directly computable basis $\mathfrak{B}$ and the vector $|\mathfrak{U}(q_*)\rangle$;

• it computes the Shannon-Fano-Elias code for $\mathfrak{B}$ with respect to $|\mathfrak{U}(q_*)\rangle$;

• it outputs the vector with code-word $q(i)$.

Since $|\mathfrak{U}(q)\rangle = |b_i\rangle$, from ($*$) one gets

\[ \ell(p_*) \le \ell(q) \le \ell(q_*) + \ell(q(i)) + K(\mathfrak{B}) + C = QC_c(\psi) + K(\mathfrak{B}) + C , \tag{$**$} \]

whence, up to an additive constant,

\[ QC_c(\psi) = \min\big\{ \ell(p) \,:\, |\mathfrak{U}(p)\rangle = |\psi\rangle \big\} \]

for all $\psi$ belonging to a directly computable ONB.


The preceding example can be used to show that bit quantum complexity and 
classical prefix complexity agree on bit strings. 


Proposition 9.3.1 For all $i \in \Omega_2^*$, $QC_c(|i\rangle) = K(i)$ up to an additive constant.


Proof Choosing the computational basis $\{|i^{(n)}\rangle : i^{(n)} \in \Omega_2^{(n)}\}$ as the directly
computable ONB $\mathfrak{B}$ of the previous example, the result follows from ($**$) because the
shortest program that tells $\mathfrak{U}$ how to generate $\mathfrak{B}$ is now such that $\ell(q_{\mathfrak{B}}) = O(1)$.


For generic qubit strings, a loose upper bound is easily obtained.

Proposition 9.3.2 [365] If $|\Psi\rangle \in \mathbb{H}_n$ is normalized, then

$$QC_c(\Psi) \le 2n + C,$$

where $C$ is a constant independent of $\Psi$.

Proof Consider the computational basis vectors $|i^{(n)}\rangle \in \mathbb{H}_n$; by expanding
$|\Psi\rangle = \sum_{i^{(n)} \in \Omega_2^{(n)}} c(i^{(n)})\, |i^{(n)}\rangle$, there must be at least one $i_\Psi^{(n)}$ such that
$|c(i_\Psi^{(n)})|^2 \ge 2^{-n}$. Let $p \in \Omega_2^*$ be a self-delimiting program such that, by literal
transcription, $|\mathfrak{U}(p)\rangle = |i_\Psi^{(n)}\rangle$. Then, with such a choice of effective description of $|\Psi\rangle$,
one gets the upper bound

$$QC_c(\Psi) \le \ell(p) + \left\lceil -\log_2 |\langle i_\Psi^{(n)} | \Psi \rangle|^2 \right\rceil \le 2n + C.$$
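
The choice made in this proof can be sketched in a few lines: pick the computational basis vector with the largest weight and evaluate the penalty $\lceil -\log_2 |c|^2 \rceil$, which never exceeds $n$ for a normalized $n$-qubit vector. The function name `transcription_bound` and the two sample states are hypothetical; the length of the transcribing program is not modelled, only the penalty term.

```python
from math import ceil, log2


def transcription_bound(amplitudes):
    """Return (index, penalty) for the basis vector with the largest weight.

    amplitudes: list of 2**n complex coefficients of a normalized state.
    penalty:    ceil(-log2 |c|^2) for the selected coefficient c.
    """
    weights = [abs(c) ** 2 for c in amplitudes]
    best = max(range(len(weights)), key=lambda k: weights[k])
    # The small tolerance guards against rounding just above an integer.
    penalty = ceil(-log2(weights[best]) - 1e-12)
    return best, penalty


n = 2
# Uniform superposition: the worst case, every weight equals 2**-n.
uniform = [0.5] * 2 ** n
# A computational basis vector (up to a phase): no penalty at all.
basis = [0, 0, 1j, 0]

print(transcription_bound(uniform))
print(transcription_bound(basis))
```

For the uniform superposition the penalty equals $n$, which together with a program of length about $n$ for the literal transcription gives the $2n + C$ of the proposition.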


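A lower bound to the bit quantum complexity of a subset of $\Psi \in \mathbb{H}_n$ can be
obtained following an argument developed in [143]. For any $\Psi \in \mathbb{H}_n$ and $\alpha > 0$, let
us define the subsets

$$\Omega_2^* \supseteq \Pi_\alpha(\Psi) := \left\{ p \in \Omega_2^* : -\log_2 |\langle \mathfrak{U}(p) | \Psi \rangle|^2 < \alpha \right\}$$

and the quantities $QC_\alpha(\Psi) := \min\{\ell(p) : p \in \Pi_\alpha(\Psi)\}$. If $\alpha > \beta$, then $\Pi_\beta(\Psi) \subseteq \Pi_\alpha(\Psi)$, whence

$$\alpha > \beta \implies QC_\beta(\Psi) \ge QC_\alpha(\Psi) \ge QC_\infty(\Psi).$$

Notice that $QC_\infty(\Psi)$ is the length of the shortest classical programs $p$ such that
$|\mathfrak{U}(p)\rangle$ is non-orthogonal to $|\Psi\rangle$, $|\langle \mathfrak{U}(p) | \Psi \rangle| > 0$. Therefore, if $QC_c(\Psi)$ is
attained at $q$, that is if

$$QC_c(\Psi) = \ell(q) + \underbrace{\left\lceil -\log_2 |\langle \mathfrak{U}(q) | \Psi \rangle|^2 \right\rceil}_{\beta},$$

then $\ell(q) \ge QC_\beta(\Psi) \ge QC_\alpha(\Psi)$ for all $\alpha > \beta$ and

$$QC_c(\Psi) \ge QC_\infty(\Psi) + \beta. \tag{9.16}$$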


The following Lemma shows that there are vectors $\Psi \in \mathbb{H}_n$ for which $QC_\infty(\Psi)$
cannot be small.

Lemma 9.3.1 Let $\mathbb{K}(d) \subseteq \mathbb{H}_n$ be a $d$-dimensional subspace; then, for all $0 < a < \log_2 d$
there exists a subspace $\mathbb{K}_a \subseteq \mathbb{K}(d)$ of dimension $d_a > d - 2^a$ such that
$QC_\infty(\Psi) \ge a$ for all $\Psi \in \mathbb{K}_a$.

Proof According to (4.6), there are fewer than $2^a$ programs $p \in \Omega_2^*$ with $\ell(p) < a$;
it follows that the subspace $\mathbb{H}(a)$ linearly spanned by the corresponding vectors
$|\mathfrak{U}(p)\rangle$ has dimension $< 2^a$. Let $\mathbb{K}(d) \subseteq \mathbb{H}_n$ be any subspace of dimension $d > 2^a$
and choose $\mathbb{K}_a \subseteq \mathbb{K}(d)$ orthogonal to $\mathbb{H}(a)$ and thus of dimension $d_a > d - 2^a$.
Now, $|\Psi\rangle \in \mathbb{K}_a$ satisfies $QC_\infty(\Psi) \ge a$, unless there is a program $p$ with $\ell(p) < a$
and $\langle \mathfrak{U}(p) | \Psi \rangle \ne 0$; this is impossible since, by construction, $|\Psi\rangle$ is orthogonal
to the linear span of the vectors $|\mathfrak{U}(p)\rangle$ with $\ell(p) < a$.


Unlike for the qubit quantum complexity $QC$, where the corresponding complexity
rate could be controlled by means of high probability subspaces, in the case of
the bit quantum complexity $QC_c$, one has to argue in terms of volumes of vectors.
Indeed, one can estimate how many unit vectors $|\Psi\rangle \in \mathbb{K}_a$ satisfy $QC_\alpha(\Psi) \le r$.
This will be done by representing $\Psi$ as a point $u \in \mathbb{R}^{2d_a}$ on the unit sphere $S_{2d_a}$,
whose coordinates are the real and imaginary parts of the Fourier coefficients of the
expansion of $|\Psi\rangle$ with respect to a chosen ONB in the subspace $\mathbb{K}_a$.

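Let $S_{2d_a}(\theta)$ denote the area of the sector of $S_{2d_a}$ consisting of unit vectors $u \in \mathbb{R}^{2d_a}$
which have scalar product $1 \ge u \cdot e \ge \cos\theta$ with respect to a fixed vector $e$; it is
expressed by

$$S_{2d_a}(\theta) = \int_0^\theta \mathrm{d}\varphi\; A_{2d_a-1}(\sin\varphi),$$

where

$$A_n(t) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\, t^{\,n-1},$$

with $\Gamma(z)$ the Euler Gamma function, is the area of the sphere $S_n$ in $\mathbb{R}^n$ of radius
$t$ (notice that area of the unit sphere and area of the sector of angle $\theta$ are related by
$A_n(1) = S_n(\pi)$). The sector area can be bounded from above as follows:

$$S_{2d_a}(\theta) = \frac{2\pi^{d_a-1/2}}{\Gamma(d_a-1/2)} \int_0^\theta \mathrm{d}\varphi\, (\sin\varphi)^{2(d_a-1)} \le \frac{2\pi^{d_a-1/2}}{\Gamma(d_a-1/2)}\, \theta\, (\sin\theta)^{2(d_a-1)}. \tag{9.17}$$

Let $\widetilde{S}_{2d_a}(\theta)$ denote the sector area $S_{2d_a}(\theta)$ normalized to that of the unit sphere,
$A_{2d_a}(1)$; then,

$$\widetilde{S}_{2d_a}(\theta) := \frac{S_{2d_a}(\theta)}{A_{2d_a}(1)} \le \frac{2\pi^{d_a-1/2}}{\Gamma(d_a-1/2)}\, \frac{\Gamma(d_a)}{2\pi^{d_a}}\, \theta\, (\sin\theta)^{2(d_a-1)} \tag{9.18}$$

$$\le d_a\, e^{-(d_a-1)(\theta-\pi/2)^2}, \tag{9.19}$$

where the last inequality comes from expanding $f(\theta) := \log\sin\theta$ around $\pi/2$,

$$f(\theta) = -\frac{1}{2}(\theta-\pi/2)^2 + \frac{1}{3!}\, f'''(\hat\theta)\,(\theta-\pi/2)^3 \le -\frac{1}{2}(\theta-\pi/2)^2, \qquad 0 \le \theta \le \pi/2,$$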


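and from the fact that $f'''(\theta) \ge 0$ for $0 < \theta \le \pi/2$. We now use (9.19) to estimate the relative volume,
$F_a(p, \alpha)$, of the subset

$$\mathcal{F}_a(p, \alpha) := \left\{ |\Psi\rangle \in \mathbb{K}_a : -\log_2 |\langle \mathfrak{U}(p) | \Psi \rangle|^2 < \alpha \right\}$$

consisting of vectors with penalty smaller than $\alpha$ with respect to a given output
$|\mathfrak{U}(p)\rangle$. Since $2^{-\alpha/2} < |\langle \mathfrak{U}(p) | \Psi \rangle| = \cos\theta = \sin(\pi/2 - \theta) \le \pi/2 - \theta$,

$$F_a(p, \alpha) \le d_a\, e^{-(d_a-1)\, 2^{-\alpha}}.$$

From this inequality we further deduce

Lemma 9.3.2 The relative volume, $F_a^r(\alpha)$, of the set

$$\mathcal{F}_a^r(\alpha) := \left\{ |\Psi\rangle \in \mathbb{K}_a : QC_\alpha(\Psi) \le r \right\}$$

satisfies

$$F_a^r(\alpha) \le d_a\, 2^r\, e^{-(d_a-1)\, 2^{-\alpha}}.$$

Proof If $|\Psi\rangle \in \mathbb{K}_a$ is such that $QC_\alpha(\Psi) \le r$, then $-\log_2 |\langle \Psi | \mathfrak{U}(p) \rangle|^2 < \alpha$ for at
least one program $p$ with $\ell(p) \le r$; the result then follows since there are at most $2^r$ such
self-delimiting programs.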
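
The sector estimate (9.19) behind this lemma can be checked numerically. The sketch below is only an illustration: it assumes the standard formula for the area of a sphere, so that the normalized sector area reads $\frac{\Gamma(d)}{\sqrt{\pi}\,\Gamma(d-1/2)} \int_0^\theta \sin^{2d-2}\varphi\, \mathrm{d}\varphi$, and it compares this quantity with $d\, e^{-(d-1)(\theta-\pi/2)^2}$. The function names and the midpoint-rule integration are choices made here, not part of the text.

```python
from math import exp, lgamma, log, pi, sin


def normalized_sector_area(d, theta, steps=20000):
    """Fraction of the unit sphere in R^(2d) within angle theta of a fixed axis."""
    # Prefactor Gamma(d) / (sqrt(pi) * Gamma(d - 1/2)), via log-Gamma for stability.
    prefactor = exp(lgamma(d) - lgamma(d - 0.5) - 0.5 * log(pi))
    # Midpoint rule for the integral of sin(phi)^(2d-2) over [0, theta].
    h = theta / steps
    integral = h * sum(sin((k + 0.5) * h) ** (2 * d - 2) for k in range(steps))
    return prefactor * integral


def gaussian_bound(d, theta):
    """Right-hand side d * exp(-(d-1) * (theta - pi/2)^2) of the sector estimate."""
    return d * exp(-(d - 1) * (theta - pi / 2) ** 2)


d = 8
for theta in (0.5, 1.0, 1.4, pi / 2):
    print(theta, normalized_sector_area(d, theta), gaussian_bound(d, theta))
```

At $\theta = \pi/2$ the normalized area is one half (a hemisphere) and at $\theta = \pi$ it is one, which gives a quick sanity check of the prefactor.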


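The complement $\mathcal{G}_a^r(\alpha)$ of $\mathcal{F}_a^r(\alpha)$ consists of $|\Psi\rangle \in \mathbb{K}_a$ such that, for every program $p$, either
$-\log_2 |\langle \Psi | \mathfrak{U}(p) \rangle|^2 \ge \alpha$ or $-\log_2 |\langle \Psi | \mathfrak{U}(p) \rangle|^2 < \alpha$, but $\ell(p) > r$. In other words,
from (9.16) and Lemma 9.3.1, it turns out that $\mathcal{G}_a^r(\alpha)$ consists of $|\Psi\rangle \in \mathbb{K}_a$ such that

$$QC_c(\Psi) \ge QC_\infty(\Psi) + \alpha \ge a + \alpha \tag{9.20}$$

or

$$QC_c(\Psi) > r - \log_2 |\langle \Psi | \mathfrak{U}(p) \rangle|^2 \ge r. \tag{9.21}$$

Notice that the relative volume $G_a^r(\alpha)$ of $\mathcal{G}_a^r(\alpha)$ is large, $G_a^r(\alpha) \ge 1 - \varepsilon$, if the relative
volume of $\mathcal{F}_a^r(\alpha)$ is small, $F_a^r(\alpha) \le \varepsilon$. Lemma 9.3.2 can then be used to prove that
for a large fraction of vectors $|\Psi\rangle \in \mathbb{K}_a$ one has $QC_c(\Psi) \simeq 2n$ when $n \to \infty$.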


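Proposition 9.3.3 For any $\varepsilon > 0$ and $\mathbb{N} \ni n$ large enough, there exists a subspace
$\mathbb{K}_{n-1} \subseteq \mathbb{H}_n$ of dimension $\ge 2^{n-1}$ containing a subset $\mathcal{G}_{n-1}$ of relative volume
$G_{n-1} \ge 1 - \varepsilon$ such that, for all $|\Psi\rangle \in \mathcal{G}_{n-1}$,

$$2 - \frac{2 + 2\log_2 n}{n} \le \frac{QC_c(\Psi)}{n} \le 2 + \frac{C}{n}.$$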


Proof Choose $\mathbb{K}(d) = \mathbb{H}_n$ in Lemma 9.3.1 and $a = n - 1$; then, there exists a subspace
$\mathbb{K}_{n-1} \subseteq \mathbb{H}_n$ of dimension $d_{n-1} \ge 2^n - 2^{n-1} = 2^{n-1}$ such that $QC_\infty(\Psi) \ge n - 1$
for all $\Psi \in \mathbb{K}_{n-1}$. Setting $r = 2n$ and $\alpha = n - 1 - 2\log_2 n$ in Lemma 9.3.2
one gets

$$F_{n-1}^{2n}(n - 1 - 2\log_2 n) \le e^{-(1 - 2^{-n+1})\, n^2 + (3n-1)\log 2}.$$

Thus, for $n$ sufficiently large, the subset $\mathcal{G}_{n-1} \subseteq \mathbb{K}_{n-1}$ of $\Psi \in \mathbb{K}_{n-1}$ that violate
$QC_{n-1-2\log_2 n}(\Psi) \le 2n$ has relative volume $G_{n-1} \ge 1 - \varepsilon$. The result then follows
by applying the upper bound in Proposition 9.3.2 and the lower bounds (9.20)
and (9.21).
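
How fast the relative volume of the exceptional set shrinks can be seen by evaluating the bound of Lemma 9.3.2 with the parameters used in this proof. The sketch assumes the bound has the form $d\, 2^r e^{-(d-1) 2^{-\alpha}}$ and takes $d = 2^{n-1}$ exactly; the function names are hypothetical.

```python
from math import exp, log, log2


def lemma_bound(d, r, alpha):
    """Upper bound d * 2^r * exp(-(d-1) * 2^(-alpha)) on the relative volume."""
    # Everything is assembled in the exponent to avoid overflow for large r.
    return exp(log(d) + r * log(2) - (d - 1) * 2.0 ** (-alpha))


def proposition_bound(n):
    """Bound for d = 2^(n-1), r = 2n and alpha = n - 1 - 2*log2(n)."""
    return lemma_bound(2 ** (n - 1), 2 * n, n - 1 - 2 * log2(n))


for n in (4, 8, 12, 16):
    print(n, proposition_bound(n))
```

With these parameters the exponent equals $-(1 - 2^{-n+1})\, n^2 + (3n-1)\log 2$, so the quadratic term dominates and the bound drops below any fixed $\varepsilon$ already for moderate $n$.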


Remark 9.3.1 Proposition 9.3.3 states that a large fraction of $n$-qubit vector states
belonging to a subspace of dimension not less than $2^{n-1}$ has a bit quantum complexity
per symbol close to 2. Notice that this is twice the qubit quantum complexity per
symbol of all pure states in the high probability subspaces of a Bernoulli quantum
source (see Example 9.2.2). However, in the latter case the fact that the complexity
rate is $\simeq 1$ follows from the specific structure of the state $\omega$ on the quantum spin chain
$\mathcal{A}$. Instead, the result of Proposition 9.3.3 does not refer to the considered $n$ qubits
belonging to a quantum chain and thus to a reference global state $\omega$. Indeed, the
weights of the subsets of vectors with bit quantum complexity rate $\simeq 2$ are estimated
in terms of relative volumes instead of probabilities as in Theorem 9.2.2.


9.3.1 Circuit Algorithmic Complexity 


Any state $|\Psi\rangle \in \mathbb{H}_n$ of $n$ qubits can be obtained as the result of an action on a fixed
state $|\Phi\rangle \in \mathbb{H}_n$ by a suitable unitary operator $U : \mathbb{H}_n \to \mathbb{H}_n$. From Remark 9.1.1.2
we know that the action of $U$ can be approximated within any $\varepsilon > 0$ by means of
a quantum circuit, that is by another unitary operator $V : \mathbb{H}_n \to \mathbb{H}_n$, consisting of
$N(U, \varepsilon)$ gates from a complete gate basis $G$. Furthermore, to leading order in the
number of qubits and of the accuracy $\varepsilon$, the number of gates scales as
$N(U, \varepsilon) = O\!\left(2^n \log\frac{1}{\varepsilon}\right)$.

This fact is essential in the definition of quantum algorithmic complexity proposed
in [247-249], where the focus is not on the effective description of the $n$-qubit states,
whether quantum or classical, but rather on the effective description of the quantum
circuits that can be used to effectively construct those quantum states up to a certain
fixed accuracy $\varepsilon$.

Given a complete gate basis $G$, the fixed ready state $|\Phi\rangle$ and the accuracy parameter
$\varepsilon > 0$, the same $|\Psi\rangle$ can be reached up to $\varepsilon$ by a certain set $\mathcal{C}_\Psi^{G,\varepsilon}$ of quantum circuits
$V_\Psi^{G,\varepsilon}$ that will be identified with their unitary actions on $|\Phi\rangle$. Let the description of
any of these circuits be encoded by a binary string $i_{V_\Psi^{G,\varepsilon}} \in \Omega_2^*$; then

Definition 9.3.2 (Circuit Quantum Complexity) Let $\Omega_\Psi^{G,\varepsilon} \subseteq \Omega_2^*$ be the subset of
strings that encode the circuits in $\mathcal{C}_\Psi^{G,\varepsilon}$; the circuit quantum algorithmic complexity
of an $n$-qubit state $|\Psi\rangle \in \mathbb{H}_n$ is the least classical prefix algorithmic complexity of
the strings in $\Omega_\Psi^{G,\varepsilon}$:

$$QC_{\mathrm{net}}^{G,\varepsilon}(\Psi) := \min\left\{ K(i_{V_\Psi^{G,\varepsilon}}) : i_{V_\Psi^{G,\varepsilon}} \in \Omega_\Psi^{G,\varepsilon} \right\}.$$

Remark 9.3.2 The dependence of $QC_{\mathrm{net}}$ on the encoding of the description of the
circuits $V_\Psi^{G,\varepsilon} \in \mathcal{C}_\Psi^{G,\varepsilon}$ can be handled as in classical algorithmic complexity: a change
of code is taken care of by a finite additive constant corresponding to a suitable
dictionary which is thus independent of the circuit described.


The physical motivation behind such a definition is that, after all, quantum states 
can be prepared by means of arrays of unitary gates that can be effectively described; 
then, the idea is to relate the complexity of vector states to the degree of compressibil- 
ity of the descriptions of the quantum circuits that provide suitable approximations 
to them. 


Example 9.3.2 If one wants to reproduce a bit string $i$ by a quantum circuit, the
first step is to associate it to a qubit vector state $|i\rangle$ of the so-called computational
basis (see Sect. 4.1.1), where

$$|i\rangle = |i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_n\rangle, \qquad i_j = 0, 1, \qquad \sigma_3 |i_j\rangle = (-1)^{i_j} |i_j\rangle.$$

This state can then easily be obtained by flipping with $\sigma_1$ the $j$-th qubit of $|0\rangle^{\otimes n}$ whenever $i_j = 1$.
The corresponding quantum circuit consists of $n$ 1-qubit gates, either trivially the
identity matrix $\mathbf{1}$ or the Pauli matrix $\sigma_1$; therefore, an upper bound to the algorithmic
complexity of the classical description of such a quantum circuit is easily seen to
scale as $n$, exactly as the Kolmogorov complexity of a generic bit string of length $n$
(see Proposition 4.1.1).
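
The circuit of this example is simple enough to be written out explicitly. In the following sketch the circuit description is a list with one gate label per qubit, `"X"` standing for the flip $\sigma_1$ and `"I"` for the identity; the labels, the function names and the sample string are illustrative assumptions, and only computational basis states are tracked, which suffices because the flips map basis states into basis states.

```python
def circuit_for_bitstring(bits):
    """Describe the circuit as one gate label per qubit: 'X' flips, 'I' does nothing."""
    return ["X" if b == "1" else "I" for b in bits]


def run_on_zero_state(gates):
    """Apply the gates to |0...0> and return the label of the resulting basis state."""
    state = [0] * len(gates)
    for j, gate in enumerate(gates):
        if gate == "X":
            state[j] ^= 1  # the flip exchanges |0> and |1> on qubit j
    return "".join(str(b) for b in state)


bits = "10110"
gates = circuit_for_bitstring(bits)
print(gates)
print(run_on_zero_state(gates))
```

The description has exactly one symbol per qubit, which is the linear scaling in $n$ mentioned in the example.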


Within this approach one usually estimates the complexity by upper bounds that
depend on results such as the one quoted in Remark 9.1.1.2, whence the circuit complexity
of a state $|\Psi\rangle$ of $n$ qubits can be estimated as follows:

$$QC_{\mathrm{net}}^{G,\varepsilon}(\Psi) = O\!\left(n^2\, 2^n \log\frac{1}{\varepsilon}\right), \tag{9.22}$$

where $f(n) = O(g(n))$ if there exists $C_{f,g} > 0$ such that $|f(n)| \le C_{f,g}\, |g(n)|$.


Remark 9.3.3 As already pointed out (see for instance Example 3.2.2), a bit string
$i^{(n)}$ of length $n$ can be associated with an interval in $[0, 1]$ of length $2^{-n}$; this latter
can also be interpreted as the volume of the subset $V(i^{(n)})$ of bit strings of any
length that are prefixed by $i^{(n)}$. Therefore, the upper bound (4.5) to the algorithmic
complexity of $i^{(n)}$ scales as $-\log_2 V(i^{(n)}) = n$. In a quantum context, because of the
lack of discreteness, given a fixed $n$-qubit vector $\Psi$, one can in general only hope to
construct it within an error $\varepsilon$; namely, a quantum circuit can be devised that outputs
$|\psi\rangle \in (\mathbb{C}^2)^{\otimes n}$ such that $|\langle \psi | \Psi \rangle| \ge 1 - \varepsilon$ for some accuracy parameter $0 < \varepsilon < 1$.
Using (9.17), one finds that the logarithm of the volume $V_\varepsilon(\Psi)$ of such a cone scales
as $2^n \log \varepsilon$ in agreement with (9.22).

Notice that the upper bound to the bit quantum complexity in Proposition 9.3.2 is
obtained by choosing $|\psi\rangle$ such that $|\langle \psi | \Psi \rangle| \ge 2^{-n}$; this corresponds to a parameter
in (9.22) which scales as $\varepsilon \simeq 1 - 2^{-n}$ and to an upper bound to $QC_{\mathrm{net}}^{G,\varepsilon}(\Psi)$ which is
only polynomially different from the one to $QC_c(\Psi)$ [249].


By means of the upper bound (9.22), it is possible to put into evidence the difference
in circuit complexity between separable and entangled states. The important
point is that a product state $|\Psi\rangle = \bigotimes_{j=1}^{J} |\Phi_j\rangle \in \mathbb{H}_n$ can be constructed with accuracy
$\varepsilon$ by means of $n$ circuits that construct the vectors $|\Phi_j\rangle$ with accuracy $\varepsilon/n$ [247].

Suppose that the state $|\Psi\rangle$ shows some entanglement between its constituent qubits;
namely, $|\Psi\rangle = \bigotimes_{j=1}^{J} |\Phi_j\rangle$, where $\sum_{j=1}^{J} n_j = n$ and $|\Phi_j\rangle \in \mathbb{H}_{n_j}$ are entangled
states of $n_j$ qubits. Then, one obtains

$$QC_{\mathrm{net}}^{G,\varepsilon}(\Psi) = O\!\left( \sum_{j=1}^{J} n_j^2\, 2^{n_j} \log\frac{J}{\varepsilon} \right).$$

For sufficiently small $\varepsilon$, the sum is upper bounded by

$$\sum_{j=1}^{J} n_j^2\, 2^{n_j} \log\frac{J}{\varepsilon} \le \frac{n^2\, 2^n}{2^{J-1}} \log\frac{J}{\varepsilon} \le n^2\, 2^n \log\frac{1}{\varepsilon}.$$

Therefore, the upper bound is the largest when the state $|\Psi\rangle$ is completely entangled,
that is when $J = 1$ and $n_j = n$; vice versa, in the case of complete separability,
namely when $J = n$ and $n_j = 1$, one gets

$$QC_{\mathrm{net}}^{G,\varepsilon}(\Psi) = O\!\left( 2n \log\frac{n}{\varepsilon} \right),$$

with polynomial instead of exponential increase with the number of qubits.
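
The gap between the separable and the entangled case can be made tangible by evaluating the upper bound for a few block structures. The sketch reads the bound as $\sum_j n_j^2\, 2^{n_j} \log(J/\varepsilon)$ with all multiplicative constants dropped, which is an assumption about the precise form of the estimate; the function name `net_bound` and the sample partitions of $n = 12$ qubits are hypothetical.

```python
from math import log


def net_bound(block_sizes, eps):
    """Sum over blocks of n_j^2 * 2^(n_j) * log(J / eps), constants dropped."""
    J = len(block_sizes)
    return sum(nj ** 2 * 2 ** nj * log(J / eps) for nj in block_sizes)


n, eps = 12, 1e-3
partitions = {
    "fully entangled": [n],
    "three blocks": [4, 4, 4],
    "fully separable": [1] * n,
}
for name, sizes in partitions.items():
    print(name, net_bound(sizes, eps))
```

For the fully separable partition the value reduces to $2n \log(n/\varepsilon)$, while a single block of $n$ qubits gives $n^2\, 2^n \log(1/\varepsilon)$; intermediate block structures fall in between.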


9.3.2 Quantum Universal Semi-density Matrix 


Like in Sect. 9.3, the quantum extension of classical algorithmic complexity proposed
in [143] starts from the classical description of quantum states $|\Psi\rangle \in \mathbb{H}_n$ of $n$-qubit
systems by means of bit strings $i \in \Omega_2^*$. However, these descriptions are not considered
as programs of a certain length which has to be minimized, but rather as bit strings
characterized by a given universal probability $P_{\mathfrak{U}}$ as explained in Remark 4.3.2.3.
For instance, if a state $|\Psi\rangle$ can be expanded with respect to the computational
basis $\{|i^{(n)}\rangle\}_{i^{(n)} \in \Omega_2^{(n)}}$ by means of coefficients which are exactly computable by a
program $j \in \Omega_2^*$ processed by a fixed UTM $\mathfrak{U}$, then

$$m(\Psi) := P_{\mathfrak{U}}(j)$$

naturally represents the universal probability of this state. Exactly computable states
are termed elementary, as are linear operators $X$ on $\mathbb{H}_n$ whose matrix elements
with respect to the computational basis can be exactly computed; operators which
can be approximated from below by an increasing sequence of elementary operators
are called lower semi-computable (see Definition 4.1.4). Then, an argument similar
to the one in Example 4.1.7.3 leads to the following result [143].


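Theorem 9.3.1 A lower semi-computable semi-density matrix $\hat\mu \in B_1(\mathbb{H}_n)$, namely
$\hat\mu \ge 0$ and $\mathrm{Tr}\,\hat\mu \le 1$, can be effectively constructed such that, for any other semi-computable
semi-density matrix $\sigma \in B_1(\mathbb{H}_n)$, there is a constant $C_\sigma$ for which
$C_\sigma\, \sigma \le \hat\mu$. Moreover, $\hat\mu$ can be identified with

$$\hat\mu = \sum_{|\Psi_{el}\rangle \in \mathbb{H}_n} m(\Psi_{el})\, |\Psi_{el}\rangle\langle\Psi_{el}|,$$

where the sum runs over all elementary vector states of $n$ qubits.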


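The operator $\hat\mu$ is a convex combination over elementary projections weighted with
their universal probabilities; since the universal probability $P_{\mathfrak{U}}$ is not normalized,
neither is $\hat\mu$. Inspired by Remark 4.3.3, it is thus suggestive to introduce an operatorial
complexity

$$\kappa := -\log_2 \hat\mu,$$

and two possible definitions of algorithmic complexity of a state $|\Psi\rangle$:

$$QC_{\hat\mu}(\Psi) := -\log_2 \langle \Psi | \hat\mu | \Psi \rangle, \qquad QC_\kappa(\Psi) := \langle \Psi | \kappa | \Psi \rangle.$$

From the concavity of the function $f(x) = \log_2 x$ it follows that

$$QC_\kappa(\Psi) \ge QC_{\hat\mu}(\Psi).$$

Indeed (see the proof of Proposition 5.5.4), with $\hat\mu = \sum_i r_i\, |r_i\rangle\langle r_i|$ the spectral
decomposition of $\hat\mu$,

$$\log_2 \langle \Psi | \hat\mu | \Psi \rangle = \log_2 \Big( \sum_i r_i\, |\langle \Psi | r_i \rangle|^2 \Big) \ge \sum_i |\langle \Psi | r_i \rangle|^2 \log_2 r_i = \langle \Psi | \log_2 \hat\mu | \Psi \rangle.$$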
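
The concavity argument can be checked on a toy spectrum. In the sketch below the semi-density matrix is represented only through its eigenvalues $r_i$ and the weights $|\langle \Psi | r_i \rangle|^2$ of a normalized vector in its eigenbasis; the names `qc_from_expectation` and `qc_from_kappa`, as well as the numerical values, are hypothetical.

```python
from math import log2


def qc_from_expectation(eigenvalues, overlaps):
    """-log2 of the expectation sum_i r_i * w_i, with w_i = |<Psi|r_i>|^2."""
    return -log2(sum(r * w for r, w in zip(eigenvalues, overlaps)))


def qc_from_kappa(eigenvalues, overlaps):
    """Expectation of kappa = -log2(mu), that is -sum_i w_i * log2(r_i)."""
    return -sum(w * log2(r) for r, w in zip(eigenvalues, overlaps))


# Hypothetical spectrum with trace <= 1 and overlaps summing to 1.
eigenvalues = [0.5, 0.25, 0.125]
overlaps = [0.2, 0.3, 0.5]
print(qc_from_expectation(eigenvalues, overlaps))
print(qc_from_kappa(eigenvalues, overlaps))
```

The first value never exceeds the second, and the two coincide when the vector is an eigenvector, that is when a single overlap equals one.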


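For bit strings $i^{(n)} \in \Omega_2^{(n)}$ the two possibilities coincide with the algorithmic complexity
$K(i^{(n)})$. Indeed, consider the computational basis vectors $|i^{(n)}\rangle$; then
$\{\langle i^{(n)} | \hat\mu | i^{(n)} \rangle\}_{i^{(n)} \in \Omega_2^{(n)}}$ is a semi-measure. As $m$ is a universal semi-measure on
$\Omega_2^*$ it follows that there exists a constant $C_{\hat\mu}$ such that

$$C_{\hat\mu}\, \langle i^{(n)} | \hat\mu | i^{(n)} \rangle \le m(i^{(n)}),$$

whence, from Remark 4.3.3, $QC_{\hat\mu}(i^{(n)}) \ge K(i^{(n)}) + O(1)$.

On the other hand, $\rho = \sum_{i^{(n)} \in \Omega_2^{(n)}} m(i^{(n)})\, |i^{(n)}\rangle\langle i^{(n)}|$ is a lower semi-computable
semi-density matrix. Therefore, the monotonicity of the logarithm as an operator
function (see Example 5.2.3.6) yields

$$C_\rho\, \rho \le \hat\mu \implies -\log_2 \rho - \log_2 C_\rho \ge \kappa,$$

whence $QC_\kappa(i^{(n)}) \le K(i^{(n)}) + O(1)$. Since the complexity defined through $\kappa$ is never smaller than the one defined through $-\log_2 \langle \Psi | \hat\mu | \Psi \rangle$, the two bounds give the claim.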

The operatorial complexity $\kappa$ has an interesting similarity with the classical algorithmic
complexity in that its mean value with respect to a lower semi-computable
density matrix $\rho$ equals its von Neumann entropy up to an additive constant [143]
(compare Corollary 4.3.2),

$$\mathrm{Tr}(\rho\, \kappa) = S_2(\rho) + O(1), \qquad S_2(\rho) := -\mathrm{Tr}(\rho \log_2 \rho).$$

Setting $\tilde\mu := \hat\mu / \mathrm{Tr}(\hat\mu)$, the positivity of the relative entropy, $S(\rho, \tilde\mu) \ge 0$, yields

$$S_2(\rho) \le \mathrm{Tr}(\rho\, \kappa) + \log_2 \mathrm{Tr}(\hat\mu).$$

By assumption, there exists a constant $C_\rho$ such that $C_\rho\, \rho \le \hat\mu$; thus, as before,

$$-\log_2 C_\rho - \log_2 \rho \ge \kappa \implies S_2(\rho) \ge \mathrm{Tr}(\rho\, \kappa) + O(1).$$


Remark 9.3.4 The topics addressed in this chapter are relatively recent and still
in their infancy so that the relations between the various extensions of classical
algorithmic complexity theory to quantum systems are largely to be explored (a
discussion of those between Vitányi's and Gács' proposals can be found in [143]).

Further, besides the previous result and Theorem 9.2.2, the connections between
quantum algorithmic complexities and the von Neumann entropy or the von Neumann
entropy rate have not yet been clarified. In particular, the randomness of quantum
dynamics, rather than of quantum states, has not been tackled yet; namely, a quantum
extension of the dynamical version of Brudno's theorem (see Corollary 4.2.1) is still
missing.